Preface to the Second, Extended Edition


New topics have been included in the second edition. They reflect recent 
progress in the field of cryptography and supplement the material covered in 
the first edition. Major extensions and enhancements are the following. 


• A complete description of the Advanced Encryption Standard AES is given
in Chapter 2 on symmetric encryption.

• In Appendix A, there is a new section on polynomials and finite fields.
There we offer a basic explanation of finite fields, which is necessary to
understand the AES.

• The description of cryptographic hash functions in Chapter 3 has been
extended. It now also includes, for example, the HMAC construction of
message authentication codes.

• Bleichenbacher’s 1-Million-Chosen-Ciphertext Attack against schemes
that implement the RSA encryption standard PKCS#1 is discussed in
detail in Chapter 3. This attack proves that adaptively-chosen-ciphertext
attacks can be a real danger in practice.

• In Chapter 9 on provably secure encryption we have added typical
security proofs for public-key encryption schemes that resist
adaptively-chosen-ciphertext attacks. Two prominent examples are studied:
Boneh’s simple-OAEP, or SAEP for short, and Cramer-Shoup’s public-key
encryption.

• Security proofs in the random oracle model are now included.
Full-domain-hash RSA signatures and SAEP serve as examples.


Furthermore, the text has been updated and clarified at various points. 
Errors and inaccuracies have been corrected. 


We thank our readers and our students for their comments and hints, and 
we are indebted to our colleague Patricia Shiroma-Brockmann and Ronan 
Nugent at Springer for proof-reading the English copy of the new and revised 
chapters. 


Nürnberg, December 2006 Hans Delfs, Helmut Knebl


Preface 


The rapid growth of electronic communication means that issues in infor- 
mation security are of increasing practical importance. Messages exchanged 
over worldwide publicly accessible computer networks must be kept confiden- 
tial and protected against manipulation. Electronic business requires digital 
signatures that are valid in law, and secure payment protocols. Modern cryp- 
tography provides solutions to all these problems. 

This book originates from courses given for students in computer science 
at the Georg-Simon-Ohm University of Applied Sciences, Nürnberg. It is
intended as a course on cryptography for advanced undergraduate and graduate
students in computer science, mathematics and electrical engineering. 

In its first part (Chapters 1-4), it covers — at an undergraduate level — the 
key concepts from symmetric and asymmetric encryption, digital signatures 
and cryptographic protocols, including, for example, identification schemes, 
electronic elections and digital cash. The focus is on asymmetric cryptography 
and the underlying modular algebra. Since we avoid probability theory in 
the first part, we necessarily have to work with informal definitions of, for 
example, one-way functions and collision-resistant hash functions. 

It is the goal of the second part (Chapters 5-10) to show, using prob- 
ability theory, how basic notions like the security of cryptographic schemes 
and the one-way property of functions can be made precise, and which as- 
sumptions guarantee the security of public-key cryptographic schemes such 
as RSA. More advanced topics, like the bit security of one-way functions, 
computationally perfect pseudorandom generators and the close relation be- 
tween the randomness and security of cryptographic schemes, are addressed. 
Typical examples of provably secure encryption and signature schemes and 
their security proofs are given. 

Though particular attention is given to the mathematical foundations 
and, in the second part, precise definitions, no special background in math- 
ematics is presumed. An introductory course typically taught for beginning 
students in mathematics and computer science is sufficient. The reader should 
be familiar with the elementary notions of algebra, such as groups, rings and 
fields, and, in the second part, with the basics of probability theory. Appendix 
A contains an exposition of the results from algebra and number theory nec- 
essary for an understanding of the cryptographic methods. It includes proofs 
and covers, for example, basics like Euclid’s algorithm and the Chinese
Remainder Theorem, but also more advanced material like Legendre and Jacobi
symbols and probabilistic prime number tests. The concepts and results from 
probability and information theory that are applied in the second part of the 
book are given in full in Appendix B. To keep the mathematics easy, we 
do not address elliptic curve cryptography. We illustrate the key concepts of 
public-key cryptography by classical examples like RSA in the quotient
rings Z_n of the integers Z.

The book starts with an introduction to classical symmetric encryption
in Chapter 2. The principles of public-key cryptography and their use for 
encryption and digital signatures are discussed in detail in Chapter 3. The 
famous and widely used RSA, ElGamal’s methods and the digital signature 
standard, Rabin’s encryption and signature schemes serve as the outstand- 
ing examples. The underlying one-way functions — modular exponentiation, 
modular powers and modular squaring — are used throughout the book, also 
in the second part. 

Chapter 4 presents typical cryptographic protocols, including key ex- 
change, identification and commitment schemes, electronic cash and elec- 
tronic elections. 

The following chapters focus on a precise definition of the key concepts 
and the security of public-key cryptography. Attacks are modeled by prob- 
abilistic polynomial algorithms (Chapter 5). One-way functions as the basic 
building blocks and the security assumptions underlying modern public-key 
cryptography are studied in Chapter 6. In particular, the bit security of the 
RSA function, the discrete logarithm and the Rabin function is analyzed in 
detail (Chapter 7). The close relation between one-way functions and com- 
putationally perfect pseudorandom generators meeting the needs of cryptog- 
raphy is explained in Chapter 8. Provable security properties of encryption 
schemes are the central topic of Chapter 9. It is clarified that randomness is 
the key to security. We start with the classical notions of provable security 
originating from Shannon’s work on information theory. Typical examples 
of more recent results on the security of public-key encryption schemes are 
given, taking into account the computational complexity of attacking algo- 
rithms. A short introduction to cryptosystems, whose security can be proven 
by information-theoretic methods without any assumptions on the hardness 
of computational problems (“unconditional security approach”), supplements
the section. Finally, we discuss in Chapter 10 the levels of security of dig- 
ital signatures and give examples of signature schemes, whose security can 
be proven solely under standard assumptions like the factoring assumption, 
including a typical security proof. 


Each chapter (except Chapter 1) closes with a collection of exercises. 
Answers to the exercises are provided on the Web page for this book: 
www.informatik.fh-nuernberg.de/DelfsKnebl/Cryptography. 


We thank our colleagues and students for pointing out errors and
suggesting improvements. In particular, we express our thanks to Jörg Schwenk,
Harald Stieber and Rainer Weber. We are grateful to Jimmy Upton for 
his comments and suggestions, and we are very much indebted to Patricia 
Shiroma-Brockmann for proof-reading the English copy. Finally, we would 
like to thank Alfred Hofmann at Springer-Verlag for his support during the 
writing and publication of this book. 


Nürnberg, December 2001 Hans Delfs, Helmut Knebl
1. Introduction


Cryptography is the science of keeping secrets secret. Assume a sender,
referred to here and in what follows as Alice (as is common), wants to
send a message m to a receiver, referred to as Bob. She uses an insecure
communication channel. For example, the channel could be a computer network
or a telephone line. There is a problem if the message contains confidential 
information. The message could be intercepted and read by an eavesdropper. 
Or, even worse, the adversary, as usual referred to here as Eve, might be able 
to modify the message during transmission in such a way that the legitimate 
recipient Bob does not detect the manipulation. 

One objective of cryptography is to provide methods for preventing such 
attacks. Other objectives are discussed in Section 1.2. 


1.1 Encryption and Secrecy 


The fundamental and classical task of cryptography is to provide
confidentiality by encryption methods. The message to be transmitted (it can
be some text, numerical data, an executable program or any other kind of
information) is called the plaintext. Alice encrypts the plaintext m and
obtains the ciphertext c. The ciphertext c is transmitted to Bob. Bob turns
the ciphertext back into the plaintext by decryption. To decrypt, Bob needs
some secret information, a secret decryption key.¹ Adversary Eve may still
intercept the
ciphertext. However, the encryption should guarantee secrecy and prevent 
her from deriving any information about the plaintext from the observed 
ciphertext. 

Encryption is very old. For example, Caesar’s shift cipher² was introduced
more than 2000 years ago. Every encryption method provides an encryption
algorithm E and a decryption algorithm D. In classical encryption schemes,
both algorithms depend on the same secret key k. This key k is used for both 
encryption and decryption. These encryption methods are therefore called 


1 Sometimes the terms encipher and decipher are used instead of encrypt and
decrypt.

2 Each plaintext character is replaced by the character 3 to the right modulo 26,
i.e., a is replaced by d, b by e, ..., x by a, y by b and z by c.

symmetric. For example, in Caesar’s cipher the secret key is the offset 3 of
the shift. We have 


D(k, E(k,m)) = m for each plaintext m. 
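
As an illustration, the shift cipher and the identity D(k, E(k, m)) = m can be sketched in a few lines of Python. The function names and the restriction to the lowercase letters a..z are our assumptions for this sketch; all other characters are passed through unchanged.

```python
import string

ALPHABET = string.ascii_lowercase  # the 26 letters a..z


def encrypt(k: int, m: str) -> str:
    """E(k, m): shift every lowercase letter k places to the right, modulo 26."""
    return "".join(
        ALPHABET[(ALPHABET.index(ch) + k) % 26] if ch in ALPHABET else ch
        for ch in m
    )


def decrypt(k: int, c: str) -> str:
    """D(k, c): shift every lowercase letter k places back to the left."""
    return encrypt(-k, c)


if __name__ == "__main__":
    k = 3  # Caesar's offset
    m = "attack at dawn"
    c = encrypt(k, m)
    print(c)
    assert decrypt(k, c) == m  # D(k, E(k, m)) = m
```

The same key k is passed to both functions, which is exactly what makes the scheme symmetric.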


Symmetric encryption and the important examples DES (data encryption 
standard) and AES (advanced encryption standard) are discussed in Chap- 
ter 2. 

In 1976, W. Diffie and M.E. Hellman published their famous paper, New 
Directions in Cryptography ([DifHel76]). There they introduced the
revolutionary concept of public-key cryptography. They provided a solution to
the long standing problem of key exchange and pointed the way to digital 
signatures. The public-key encryption methods (comprehensively studied in 
Chapter 3) are asymmetric. Each recipient of messages has his personal key 
k = (pk, sk), consisting of two parts: pk is the encryption key and is made 
public, sk is the decryption key and is kept secret. If Alice wants to send a 
message m to Bob, she encrypts m by use of Bob’s publicly known encryption 
key pk. Bob decrypts the ciphertext by use of his decryption key sk, which 
is known only to him. We have 


D(sk, E(pk,m)) = m. 


Mathematically speaking, public-key encryption is a so-called one-way 
function with a trapdoor. Everyone can easily encrypt a plaintext using the 
public key pk, but the other direction is difficult. It is practically impossible 
to deduce the plaintext from the ciphertext, without knowing the secret key 
sk (which is called the trapdoor information). 
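
The relation D(sk, E(pk, m)) = m can be made concrete with a toy version of RSA. The following Python sketch is only an illustration under our own assumptions: the primes 61 and 53 and the exponent 17 are chosen for readability and are far too small to offer any security, and messages are encrypted as plain integers without the padding that a real scheme requires.

```python
# Toy RSA parameters (assumed for illustration, far too small to be secure).
p, q = 61, 53
n = p * q                  # public modulus
phi = (p - 1) * (q - 1)    # known only to the key owner
e = 17                     # public exponent, coprime to phi
d = pow(e, -1, phi)        # secret exponent, the trapdoor (Python 3.8+)

pk = (n, e)  # encryption key, made public
sk = (n, d)  # decryption key, kept secret


def encrypt(pk, m: int) -> int:
    """E(pk, m): easy for everyone who knows the public key."""
    n, e = pk
    return pow(m, e, n)


def decrypt(sk, c: int) -> int:
    """D(sk, c): easy only with the trapdoor information d."""
    n, d = sk
    return pow(c, d, n)


if __name__ == "__main__":
    m = 65
    c = encrypt(pk, m)
    assert decrypt(sk, c) == m  # D(sk, E(pk, m)) = m
```

Anyone can call encrypt with pk, whereas decrypt needs d, which can be computed from e only if phi, and hence the factorization of n, is known.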

Public-key encryption methods require more complex computations and 
are less efficient than classical symmetric methods. Thus symmetric methods 
are used for the encryption of large amounts of data. Before applying sym- 
metric encryption, Alice and Bob have to agree on a key. To keep this key 
secret, they need a secure communication channel. It is common practice to 
use public-key encryption for this purpose. 


1.2 The Objectives of Cryptography 


Providing confidentiality is not the only objective of cryptography. Cryptog- 
raphy is also used to provide solutions for other problems: 


1. Data integrity. The receiver of a message should be able to check whether 
the message was modified during transmission, either accidentally or de- 
liberately. No one should be able to substitute a false message for the 
original message, or for parts of it. 

2. Authentication. The receiver of a message should be able to verify its 
origin. No one should be able to send a message to Bob and pretend to 
be Alice (data origin authentication). When initiating a communication,
Alice and Bob should be able to identify each other (entity authentica- 
tion). 

3. Non-repudiation. The sender should not be able to later deny that she 
sent a message. 


If messages are written on paper, the medium — paper — provides a certain se- 
curity against manipulation. Handwritten personal signatures are intended to 
guarantee authentication and non-repudiation. If electronic media are used, 
the medium itself provides no security at all, since it is easy to replace some 
bytes in a message during its transmission over a computer network, and it 
is particularly easy if the network is publicly accessible, like the Internet. 

So, while encryption has a long history,³ the need for techniques
providing data integrity and authentication resulted from the rapidly increasing
significance of electronic communication. 

There are symmetric as well as public-key methods to ensure the integrity 
of messages. Classical symmetric methods require a secret key k that is shared 
by sender and receiver. The message m is augmented by a message authenti- 
cation code (MAC). The code is generated by an algorithm and depends on 
the secret key. The augmented message (m, MAC(k, m)) is protected against
modifications. The receiver may test the integrity of an incoming message
(m, m̄) by checking whether


MAC(k, m) = m̄.


Message authentication codes may be implemented by keyed hash functions 
(see Chapter 3). 
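
A minimal sketch of this check, using the HMAC construction from Python's standard library as the keyed hash function. The choice of SHA-256 and the function names mac and check are our assumptions for the example.

```python
import hashlib
import hmac
import secrets


def mac(k: bytes, m: bytes) -> bytes:
    """MAC(k, m): a keyed hash of the message (here HMAC with SHA-256)."""
    return hmac.new(k, m, hashlib.sha256).digest()


def check(k: bytes, m: bytes, tag: bytes) -> bool:
    """Receiver's test: recompute MAC(k, m) and compare it with the received tag."""
    return hmac.compare_digest(mac(k, m), tag)


if __name__ == "__main__":
    k = secrets.token_bytes(32)      # secret key shared by sender and receiver
    m = b"transfer 100 EUR to Bob"
    tag = mac(k, m)                  # the sender transmits (m, tag)
    assert check(k, m, tag)
    assert not check(k, b"transfer 900 EUR to Bob", tag)
```

The comparison uses hmac.compare_digest rather than ==, so that the time taken does not depend on how many leading bytes of the two tags agree.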

Digital signatures require public-key methods (see Chapter 3 for examples 
and details). As with classical handwritten signatures, they are intended to 
provide authentication and non-repudiation. Note that non-repudiation is an 
indispensable feature if digital signatures are used to sign contracts. Digital 
signatures depend on the secret key of the signer — they can be generated only 
by him. On the other hand, anyone can check whether a signature is valid, 
by applying a publicly known verification algorithm Verify, which depends 
on the public key of the signer. If Alice wants to sign the message m, she 
applies the algorithm Sign with her secret key sk and gets the signature 
Sign(sk,m). Bob receives a signature s for message m, and may then check 
the signature by testing whether 


Verify(pk, s, m) = ok,


with Alice’s public key pk. 
It is common not to sign the message itself, but to apply a cryptographic 
hash function (see Section 3.4) first and then sign the hash value. In schemes 


3 For the long history of cryptography, see [Kahn67]. 
like the famous RSA (named after its inventors: Rivest, Shamir and
Adleman), the decryption algorithm is used to generate signatures and the
encryption algorithm is used to verify them. This approach to digital signatures is
therefore often referred to as the “hash-then-decrypt” paradigm (see Section 
3.4.5 for details). More sophisticated signature schemes, like the probabilis- 
tic signature scheme (PSS), require more steps. Modifying the hash value 
by pseudorandom sequences turns signing into a probabilistic procedure (see 
Section 3.4.5). 
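
The hash-then-decrypt paradigm can be sketched with toy RSA parameters. This is an illustration under our own assumptions, not the scheme of Section 3.4.5: the modulus is far too small to be secure, SHA-256 is an arbitrary choice of hash function, and reducing the hash value modulo the tiny n is a shortcut that is acceptable only in a toy example.

```python
import hashlib

# Toy RSA key pair (assumed parameters, far too small for real use).
p, q = 61, 53
n = p * q
e = 17
d = pow(e, -1, (p - 1) * (q - 1))  # modular inverse, Python 3.8+
pk, sk = (n, e), (n, d)


def h(m: bytes) -> int:
    """Hash the message and reduce it into Z_n (a shortcut for the toy modulus)."""
    return int.from_bytes(hashlib.sha256(m).digest(), "big") % n


def sign(sk, m: bytes) -> int:
    """Sign(sk, m): apply the RSA decryption function to the hash value."""
    n, d = sk
    return pow(h(m), d, n)


def verify(pk, s: int, m: bytes) -> bool:
    """Verify(pk, s, m): apply the RSA encryption function and compare with h(m)."""
    n, e = pk
    return pow(s, e, n) == h(m)


if __name__ == "__main__":
    m = b"I owe Bob 10 EUR"
    s = sign(sk, m)
    assert verify(pk, s, m)
```

Only the holder of d can compute s, while anyone who knows (n, e) can run verify.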

Digital signatures depend on the message. Distinct messages yield dif- 
ferent signatures. Thus, like classical message authentication codes, digital 
signatures can also be used to guarantee the integrity of messages. 


1.3 Attacks 


The primary goal of cryptography is to keep the plaintext secret from eaves- 
droppers trying to get some information about the plaintext. As discussed 
before, adversaries may also be active and try to modify the message. Then, 
cryptography is expected to guarantee the integrity of the messages. Adver- 
saries are assumed to have complete access to the communication channel. 

Cryptanalysis is the science of studying attacks against cryptographic 
schemes. Successful attacks may, for example, recover the plaintext (or parts 
of the plaintext) from the ciphertext, substitute parts of the original mes- 
sage, or forge digital signatures. Cryptography and cryptanalysis are often 
subsumed by the more general term cryptology. 

A fundamental assumption in cryptanalysis was first stated by A. Kerckhoffs
in the nineteenth century. It is usually referred to as Kerckhoffs’ Principle. It
states that the adversary knows all the details of the cryptosystem, includ- 
ing algorithms and their implementations. According to this principle, the 
security of a cryptosystem must be entirely based on the secret keys. 

Attacks on the secrecy of an encryption scheme try to recover plaintexts 
from ciphertexts, or even more drastically, to recover the secret key. The fol- 
lowing survey is restricted to passive attacks. The adversary, as usual we call 
her Eve, does not try to modify the messages. She monitors the communica- 
tion channel and the end points of the channel. So she may not only intercept 
the ciphertext, but (at least from time to time) she may be able to observe 
the encryption and decryption of messages. She has no information about 
the key. For example, Eve might be the operator of a bank computer. She 
sees incoming ciphertexts and sometimes also the corresponding plaintexts. 
Or she observes the outgoing plaintexts and the generated ciphertexts.
Perhaps she even manages to have plaintexts of her own choice encrypted or
ciphertexts of her own choice decrypted.

The possible attacks depend on the actual resources of the adversary Eve. 
They are usually classified as follows: 
1. Ciphertext-only attack. Eve has the ability to obtain ciphertexts. This
is likely to be the case in any encryption situation. Even if Eve cannot 
perform the more sophisticated attacks described below, one must assume 
that she can get access to encrypted messages. An encryption method 
that cannot resist a ciphertext-only attack is completely insecure. 

2. Known-plaintext attack. Eve has the ability to obtain plaintext-ciphertext 
pairs. Using the information from these pairs, she attempts to decrypt a 
ciphertext for which she does not have the plaintext. At first glance, it 
might appear that such information would not ordinarily be available to 
an attacker. However, it very often is available. Messages may be sent in 
standard formats which Eve knows. 

3. Chosen-plaintext attack. Eve has the ability to obtain ciphertexts for 
plaintexts of her choosing. Then she attempts to decrypt a ciphertext 
for which she does not have the plaintext. While again this may seem 
unlikely, there are many cases in which Eve can do just this. For example, 
she sends some interesting information to her intended victim which she 
is confident he will encrypt and send out. This type of attack assumes 
that Eve must first obtain whatever plaintext-ciphertext pairs she wants 
and then do her analysis, without any further interaction. This means 
that she only needs access to the encrypting device once. 

4. Adaptively-chosen-plaintext attack. This is the same as the previous at- 
tack, except now Eve may do some analysis on the plaintext-ciphertext 
pairs, and subsequently get more pairs. She may switch between gather- 
ing pairs and performing the analysis as often as she likes. This means 
that she has either lengthy access to the encrypting device or can some- 
how make repeated use of it. 

5. Chosen- and adaptively-chosen-ciphertext attack. These two attacks are 
similar to the above plaintext attacks. Eve can choose ciphertexts and 
gets the corresponding plaintexts. She has access to the decryption de- 
vice. 
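
To see how little some ciphers withstand even the weakest of these attacks, consider the shift cipher of Section 1.1. The following Python sketch (the function names are ours, and the cipher is restricted to the letters a..z) shows a known-plaintext attack that recovers the key from a single letter pair, and a ciphertext-only attack that simply tries all 26 keys.

```python
import string

ALPHABET = string.ascii_lowercase


def shift(k: int, text: str) -> str:
    """Shift cipher: move each lowercase letter k places to the right, modulo 26."""
    return "".join(
        ALPHABET[(ALPHABET.index(ch) + k) % 26] if ch in ALPHABET else ch
        for ch in text
    )


def recover_key(m: str, c: str) -> int:
    """Known-plaintext attack: one plaintext-ciphertext letter pair reveals k."""
    for pm, pc in zip(m, c):
        if pm in ALPHABET:
            return (ALPHABET.index(pc) - ALPHABET.index(pm)) % 26
    raise ValueError("the pair contains no letter to compare")


def brute_force(c: str) -> list:
    """Ciphertext-only attack: list the decryption under each of the 26 keys."""
    return [shift(-k, c) for k in range(26)]


if __name__ == "__main__":
    secret_key = 11
    known_m = "hello"
    known_c = shift(secret_key, known_m)      # a pair Eve has observed
    k = recover_key(known_m, known_c)
    assert shift(-k, shift(secret_key, "meet me at noon")) == "meet me at noon"
```

In the ciphertext-only case Eve still has to recognize the meaningful candidate among the 26 outputs of brute_force, which is easy for natural-language plaintexts.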


1.4 Cryptographic Protocols 


Encryption and decryption algorithms, cryptographic hash functions or 
pseudorandom generators (see Section 2.1, Chapter 8) are the basic building 
blocks (also called cryptographic primitives) for solving problems involving 
secrecy, authentication or data integrity. 

In many cases a single building block is not sufficient to solve the given 
problem: different primitives must be combined. A series of steps must be 
executed to accomplish a given task. Such a well-defined series of steps is 
called a cryptographic protocol. As is also common, we add another condition: 
we require that two or more parties are involved. We only use the term 
protocol if at least two people are required to complete the task. 
As a counterexample, take a look at digital signature schemes. A typical
scheme for generating a digital signature first applies a cryptographic hash 
function h to the message m and then, in a second step, computes the signa- 
ture by applying a public-key decryption algorithm to the hash value h(m). 
Both steps are done by one person. Thus, we do not call it a protocol. 

Typical examples of protocols are protocols for user identification. There 
are many situations where the identity of a user Alice has to be verified. 
Alice wants to log in to a remote computer, for example, or to get access 
to an account for electronic banking. Passwords or PINs are used
for this purpose. This method is not always secure. For example, anyone 
who observes Alice’s password or PIN when transmitted might be able to 
impersonate her. We sketch a simple challenge-and-response protocol which 
prevents this attack (however, it is not perfect; see Section 4.2.1). 

The protocol is based on a public-key signature scheme, and we assume 
that Alice has a key k = (pk, sk) for this scheme. Now, Alice can prove her 
identity to Bob in the following way. 


1. Bob randomly chooses a “challenge” c and sends it to Alice.

2. Alice signs c with her secret key, s := Sign(sk, c), and sends the
“response” s to Bob.

3. Bob accepts Alice’s proof of identity, if Verify(pk, s, c) = ok.


Only Alice can return a valid signature of the challenge c, because only she 
knows the secret key sk. Thus, Alice proves her identity, without showing her 
secret. No one can observe Alice’s secret key, not even the verifier Bob. 

Suppose that an eavesdropper Eve observed the exchanged messages. 
Later, she wants to impersonate Alice. Since Bob selects his challenge c at 
random (from a huge set), the probability that he uses the same challenge 
twice is very small. Therefore, Eve cannot gain any advantage by her obser- 
vations. 
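
The three steps of the challenge-and-response protocol can be sketched as follows. The sketch rests on our own assumptions: a toy hash-then-decrypt RSA signature with an insecurely small modulus stands in for the signature scheme, a 16-byte random string serves as the challenge, and the function names are hypothetical.

```python
import hashlib
import secrets

# Alice's toy RSA key pair (assumed parameters, far too small for real use).
p, q = 61, 53
n = p * q
e = 17
d = pow(e, -1, (p - 1) * (q - 1))  # modular inverse, Python 3.8+
pk, sk = (n, e), (n, d)


def h(c: bytes) -> int:
    """Hash the challenge into Z_n (a shortcut that only suits the toy modulus)."""
    return int.from_bytes(hashlib.sha256(c).digest(), "big") % n


def bob_challenge() -> bytes:
    """Step 1: Bob chooses a random challenge c from a huge set."""
    return secrets.token_bytes(16)


def alice_response(sk, c: bytes) -> int:
    """Step 2: Alice computes s := Sign(sk, c)."""
    n, d = sk
    return pow(h(c), d, n)


def bob_accepts(pk, s: int, c: bytes) -> bool:
    """Step 3: Bob accepts if Verify(pk, s, c) = ok."""
    n, e = pk
    return pow(s, e, n) == h(c)


if __name__ == "__main__":
    c = bob_challenge()
    s = alice_response(sk, c)
    assert bob_accepts(pk, s, c)
```

The secret key sk never leaves alice_response; Bob sees only the challenge and the signature. With a realistic modulus, a response recorded for one challenge is useless for a fresh one, which is why Bob must not reuse challenges.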

The parties in a protocol can be friends or adversaries. Protocols can be 
attacked. The attacks may be directed against the underlying cryptographic 
algorithms or against the implementation of the algorithms and protocols. 
There may also be attacks against a protocol itself. There may be passive 
attacks performed by an eavesdropper, where the only purpose is to obtain 
information. An adversary may also try to gain an advantage by actively 
manipulating the protocol. She might pretend to be someone else, substitute 
messages or replay old messages. 

Important protocols for key exchange, electronic elections, digital cash 
and interactive proofs of identity are discussed in Chapter 4. 


1.5 Provable Security 


It is desirable to design cryptosystems that are provably secure. Provably
secure means that mathematical proofs show that the cryptosystem resists
certain types of attacks. Pioneering work in this field was done by C.E. Shannon.
In his information theory, he developed measures for the amount of informa- 
tion associated with a message and the notion of perfect secrecy. A perfectly 
secret cipher perfectly resists all ciphertext-only attacks. An adversary gets 
no information at all about the plaintext, even if his resources in comput- 
ing power and time are unlimited. Vernam’s one-time pad (see Section 2.1), 
which encrypts a message m by XORing it bitwise with a truly random bit 
string, is the most famous perfectly secret cipher. It even resists all the pas- 
sive attacks mentioned. This can be mathematically proven by Shannon’s 
theory. Classical information-theoretic security is discussed in Section 9.1; 
an introduction to Shannon’s information theory may be found in Appendix 
B. Unfortunately, Vernam’s one-time pad and all perfectly secret ciphers are 
usually impractical. It is not practical in most situations to generate and 
handle truly random bit sequences of sufficient length as required for perfect 
secrecy. 
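
The mechanics of the one-time pad are easily sketched in Python. Note the assumption made here: secrets.token_bytes draws the key from the operating system's random number generator, which only approximates the truly random bit string that perfect secrecy requires. The function names are ours.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two byte strings of equal length."""
    if len(a) != len(b):
        raise ValueError("message and key must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def otp_keygen(length: int) -> bytes:
    """Generate a fresh key that is as long as the message."""
    return secrets.token_bytes(length)


def otp_encrypt(k: bytes, m: bytes) -> bytes:
    """c = m XOR k."""
    return xor_bytes(m, k)


def otp_decrypt(k: bytes, c: bytes) -> bytes:
    """m = c XOR k, since XORing with k twice cancels out."""
    return xor_bytes(c, k)


if __name__ == "__main__":
    m = b"attack at dawn"
    k = otp_keygen(len(m))   # must be used for one message only
    c = otp_encrypt(k, m)
    assert otp_decrypt(k, c) == m
```

The sketch also makes the practical difficulty visible: every message consumes a key of its own length, and that key has to reach the receiver over a secure channel beforehand.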

More recent approaches to provable security therefore abandon the ideal 
of perfect secrecy and the (unrealistic) assumption of unbounded computing 
power. The computational complexity of algorithms is taken into account. 
Only attacks that might be feasible in practice are considered. Feasible means 
that the attack can be performed by an efficient algorithm. Of course, here 
the question about the right notion of efficiency arises. Certainly, algorithms 
with non-polynomial running time are inefficient. Conversely, algorithms with
polynomial running time are often considered to be the efficient ones. In this
book, we also adopt this notion of efficiency. 

The way a cryptographic scheme is attacked might be influenced by ran- 
dom events. Adversary Eve might toss a coin to decide which case she tries 
next. Therefore, probabilistic algorithms are used to model attackers. Break- 
ing an encryption system, for example by a ciphertext-only attack, means that 
a probabilistic algorithm with polynomial running time manages to derive in- 
formation about the plaintext from the ciphertext, with some non-negligible 
probability. Probabilistic algorithms can toss coins, and their control flow 
may be at least partially directed by these random events. By using random 
sources, they can be implemented in practice. They must not be confused 
with non-deterministic algorithms. The notion of probabilistic (polynomial) 
algorithms and the underlying probabilistic model are discussed in Chap- 
ter 5. 

The security of a public-key cryptosystem is based on the hardness of 
some computational problem (there is no efficient algorithm for solving the 
problem). For example, the secret keys of an RSA scheme could be easily 
figured out if computing the prime factors of a large integer were possible.⁴


4 What “large” means depends on the available computing power. Today, a 1024- 
bit integer is considered as large. 
However, it is believed that factoring large integers is infeasible.⁵ There are
no mathematical proofs for the hardness of the computational problems used 
in public-key systems. Therefore, security proofs for public-key methods are 
always conditional: they depend on the validity of the underlying assumption. 

The assumption usually states that a certain function f is one way; i.e., f
can be computed efficiently, but it is infeasible to compute x from f(x). The 
assumptions, as well as the notion of a one-way function, can be made very 
precise by the use of probabilistic polynomial algorithms. The probability of 
successfully inverting the function by a probabilistic polynomial algorithm 
is negligibly small, and negligibly small means that it is asymptotically less
than the reciprocal of any given polynomial (see Chapter 6, Definition 6.12). Important
examples, like the factoring, discrete logarithm and quadratic residuosity 
assumptions, are included in this book (see Chapter 6). 

There are analogies to the classical notions of security. Shannon’s perfect 
secrecy has a computational analogy: ciphertext indistinguishability (or se- 
mantic security). An encryption is perfectly secret if and only if an adversary 
cannot distinguish between two plaintexts, even if her computing resources 
are unlimited: if adversary Eve knows that a ciphertext c is the encryption of 
either m or m’, she has no better chance than 1/2 of choosing the right one. 
Ciphertext indistinguishability — also called polynomial-time indistinguisha- 
bility — means that Eve’s chance of successfully applying a probabilistic poly- 
nomial algorithm is at most negligibly greater than 1/2 (Chapter 9, Definition 
9.14). 

As a typical result, it is proven in Section 9.4 that public-key one-time 
pads are ciphertext-indistinguishable. This means, for example, that the RSA 
public-key one-time pad is ciphertext-indistinguishable under the sole as- 
sumption that the RSA function is one way. A public-key one-time pad is 
similar to Vernam’s one-time pad. The difference is that the message m is 
XORed with a pseudorandom bit sequence which is generated from a short 
truly random seed, by means of a one-way function. 

Thus, one-way functions are not only the essential ingredients of public- 
key encryption and digital signatures. They also yield computationally perfect 
pseudorandom bit generators (Chapter 8). If f is a one-way function, it is not 
only infeasible to compute x from f(x), but certain bits (called hard-core
bits) of x are equally difficult to deduce. This feature is called the bit security
of a one-way function. For example, the least-significant bit is a hard-core bit
for the RSA function x ↦ x^e mod n. Starting with a truly random seed,
repeatedly applying f and taking the hard-core bit in each step, you get 
a pseudorandom bit sequence. These bit sequences cannot be distinguished 
from truly random bit sequences by an efficient algorithm, or, equivalently 
(Yao’s Theorem, Section 8.2), it is practically impossible to predict the next 
bit from the previous ones. So they are really computationally perfect. 
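
The construction just described can be sketched for the RSA function. The sketch makes several assumptions of its own: the modulus 3233 = 61 · 53 and the exponent 17 are toy values with no security, the seed comes from the operating system's random number generator rather than a truly random source, and the order "output the bit, then apply f" is one possible reading of the construction detailed in Chapter 8.

```python
import secrets


def rsa_bit_generator(seed: int, n: int, e: int, count: int) -> list:
    """Iterate f(x) = x^e mod n and output the least-significant bit at each step."""
    x = seed % n
    bits = []
    for _ in range(count):
        bits.append(x & 1)   # hard-core bit of the current value
        x = pow(x, e, n)     # apply the one-way function
    return bits


if __name__ == "__main__":
    n, e = 3233, 17                      # toy RSA parameters (assumed)
    seed = secrets.randbelow(n - 2) + 2  # short random seed in [2, n - 1]
    print(rsa_bit_generator(seed, n, e, 32))
```

The generator is deterministic: the same seed always yields the same sequence, so all the randomness of the output stems from the short seed.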


5 It is not known whether breaking RSA is easier than factoring the modulus. See
Chapters 3 and 6 for a detailed discussion. 
The bit security of important one-way functions is studied in detail in
Chapter 7 including an in-depth analysis of the probabilities involved. 

Randomness and the security of cryptographic schemes are closely related. 
There is no security without randomness. An encryption method provides se- 
crecy only if the ciphertexts appear random to the adversary Eve. Vernam’s 
one-time pad is perfectly secret, because, due to the truly random key string 
k, the encrypted message m ⊕ k⁶ is a truly random bit sequence for Eve.
The public-key one-time pad is ciphertext-indistinguishable, because if Eve 
applies an efficient probabilistic algorithm, she cannot distinguish the pseudo- 
random key string and, as a consequence, the ciphertext from a truly random 
sequence. 

Public-key one-time pads are secure against passive eavesdroppers, who 
perform a ciphertext-only attack (see Section 1.3 above for a classification 
of attacks). However, active adversaries, who perform adaptively-chosen- 
ciphertext attacks, can be a real danger in practice — as demonstrated by Ble- 
ichenbacher’s 1-Million-Chosen-Ciphertext Attack (Section 3.3.3). Therefore, 
security against such attacks is also desirable. In Section 9.5, we study two ex- 
amples of public-key encryption schemes which are secure against adaptively- 
chosen-ciphertext attacks, and their security proofs. One of the examples, 
Cramer-Shoup’s public key encryption scheme, was the first practical scheme 
whose security proof is based solely on a standard number-theoretic assump- 
tion and a standard assumption of hash functions (collision-resistance). 

The ideal cryptographic hash function is a random function. It yields hash 
values which cannot be distinguished from randomly selected and uniformly 
distributed values. Such a random function is also called a random oracle. 
Sometimes, the security of a cryptographic scheme can be proven in the 
random oracle model. In addition to the assumed hardness of a computational 
problem, such a proof relies on the assumption that the hash functions used 
in the scheme are truly random functions. Examples of such schemes include 
the public-key encryption schemes OAEP (Section 3.3.4) and SAEP (Section 
9.5.1), the above mentioned signature scheme PSS and full-domain-hash RSA 
signatures (Section 3.4.5). We give the random-oracle proofs for SAEP and 
full-domain-hash signatures. 

Truly random functions cannot be implemented, nor even perfectly
approximated in practice. Therefore, a proof in the random oracle model can
never be a complete security proof. The hash functions used in practice are 
constructed to be good approximations to the ideal of random functions. 
However, there were surprising errors in the past (see Section 3.4). 

We distinguished different types of attacks on an encryption scheme. In a 
similar way, the attacks on signature schemes can be classified and different 
levels of security can be defined. We introduce this classification in Chap- 
ter 10 and give examples of signature schemes whose security can be proven 
solely under standard assumptions (like the factoring or the strong RSA
assumption). No assumptions on the randomness of a hash function have to be
made, in contrast, for example, to schemes like PSS. A typical security proof
for the highest level of security is included. For the given signature scheme,
we show that not a single signature can be forged, even if the attacker Eve
is able to obtain valid signatures from the legitimate signer, for messages she
has chosen adaptively.

6 $\oplus$ denotes the bitwise XOR operator, see page 13.

The security proofs for public-key systems are always conditional and de- 
pend on (widely believed, but unproven) assumptions. On the other hand, 
Shannon’s notion of perfect secrecy and, in particular, the perfect secrecy 
of Vernam’s one-time pad are unconditional. Although perfect unconditional 
security is not reachable in most practical situations, there are promising at- 
tempts to design practical cryptosystems which provably come close to perfect 
information-theoretic security. The proofs are based on classical information- 
theoretic methods and do not depend on unproven assumptions. The security 
relies on the fact that communication channels are noisy or on the limited 
storage capacity of an adversary. Some results in this approach are reviewed 
in the chapter on provably secure encryption (Section 9.6). 


2. Symmetric-Key Encryption 


In this chapter, we give an introduction to symmetric-key encryption. We 
explain the notions of stream and block ciphers. The operation modes of 
block ciphers are studied and, as prominent examples for block ciphers, DES 
and AES are described. 

Symmetric-key encryption provides secrecy when two parties, say Alice 
and Bob, communicate. An adversary who intercepts a message should not 
get any significant information about its content. 

To set up a secure communication channel, Alice and Bob first agree on 
a key k. They keep their shared key k secret. Before sending a message m
to Bob, Alice encrypts m by using the encryption algorithm E and the key
k. She obtains the ciphertext c = E(k,m) and sends c to Bob. By using the 
decryption algorithm D and the same key k, Bob decrypts c to recover the 
plaintext m = D(k, c). 

We speak of symmetric encryption, because both communication part- 
ners use the same key k for encryption and decryption. The encryption and 
decryption algorithms E and D are publicly known. Anyone can decrypt a
ciphertext, if he or she knows the key. Thus, the key k has to be kept secret. 

A basic problem in a symmetric scheme is how Alice and Bob can agree 
on a shared secret key k in a secure and efficient way. For this key exchange, 
the methods of public-key cryptography are needed, which we discuss in the 
subsequent chapters. There were no solutions to the key exchange problem
until the revolutionary concept of public-key cryptography was discovered in
the 1970s.

We require that the encrypted plaintext m can be uniquely recovered 
from the ciphertext c. This means that for a fixed key k, the encryption map 
must be bijective. Mathematically, symmetric encryption may be considered 
as follows. 


Definition 2.1. A symmetric-key encryption scheme consists of a map

$$E : K \times M \longrightarrow C,$$

such that for each $k \in K$, the map

$$E_k : M \longrightarrow C, \quad m \longmapsto E(k,m)$$

is invertible. The elements $m \in M$ are the plaintexts (also called messages).
$C$ is the set of ciphertexts or cryptograms, the elements $k \in K$ are the keys.
$E_k$ is called the encryption function with respect to the key $k$. The inverse
function $D_k := E_k^{-1}$ is called the decryption function. It is assumed that
efficient algorithms to compute $E_k$ and $D_k$ exist.


The key $k$ is shared between the communication partners and kept secret.
A basic security requirement for the encryption map $E$ is that, without
knowing the key $k$, it should be impossible to successfully execute the
decryption function $D_k$. Important examples of symmetric-key encryption
schemes (Vernam’s one-time pad, DES and AES) are given below.

Among all encryption algorithms, symmetric-key encryption algorithms 
have the fastest implementations in hardware and software. Therefore, they 
are very well-suited to the encryption of large amounts of data. If Alice and 
Bob want to use a symmetric-key encryption scheme, they first have to ex- 
change a secret key. For this, they have to use a secure communication chan- 
nel. Public-key encryption methods, which we study in Chapter 3, are often 
used for this purpose. Public-key encryption schemes are less efficient and 
hence not suitable for large amounts of data. Thus, symmetric-key encryp- 
tion and public-key encryption complement each other to provide practical 
cryptosystems. 

We distinguish between block ciphers and stream ciphers. The encryption 
function of a block cipher processes plaintexts of fixed length. A stream ci- 
pher operates on streams of plaintext. Processing character by character, it 
encrypts plaintext strings of arbitrary length. If the plaintext length exceeds 
the block length of a block cipher, various modes of operation are used. Some 
of them yield stream ciphers. Thus, block ciphers may also be regarded as 
building blocks for stream ciphers. 


2.1 Stream Ciphers 


Definition 2.2. Let $K$ be a set of keys and $M$ be a set of plaintexts. In this
context, the elements of $M$ are called characters.
A stream cipher

$$E^* : K^* \times M^* \longrightarrow C^*, \quad E^*(k,m) := c := c_1 c_2 c_3 \ldots$$

encrypts a stream $m := m_1 m_2 m_3 \ldots \in M^*$ of plaintext characters $m_i \in M$
as a stream $c := c_1 c_2 c_3 \ldots \in C^*$ of ciphertext characters $c_i \in C$ by using a
key stream $k := k_1 k_2 k_3 \ldots \in K^*$, $k_i \in K$.

The plaintext stream $m = m_1 m_2 m_3 \ldots$ is encrypted character by character.
For this purpose, there is an encryption map

$$E : K \times M \longrightarrow C,$$

which encrypts the single plaintext characters $m_i$ with the corresponding key
character $k_i$:

$$c_i = E_{k_i}(m_i) = E(k_i, m_i), \quad i = 1, 2, \ldots$$

Typically, the characters in $M$ and $C$ and the key elements in $K$ are binary
digits or bytes.


Of course, encrypting plaintext characters with $E_{k_i}$ must be a bijective
map, for every key character $k_i \in K$. Decrypting a ciphertext stream
$c := c_1 c_2 c_3 \ldots$ is done character by character by applying the decryption
map $D$ with the same key stream $k = k_1 k_2 k_3 \ldots$ that was used for encryption:

$$c = c_1 c_2 c_3 \ldots \longmapsto D^*(k,c) := D_{k_1}(c_1)\, D_{k_2}(c_2)\, D_{k_3}(c_3) \ldots$$


A necessity for stream ciphers comes, for example, from operating sys- 
tems, where input and output is done with so-called streams. 

Of course, the key stream in a stream cipher has to be kept secret. It is 
not necessarily the secret key which is shared between the communication 
partners; the key stream might be generated from the shared secret key by a 
pseudorandom generator (see below). 


Notation. In most stream ciphers, the binary exclusive-or operator XOR
for bits $a, b \in \{0,1\}$ (considered as truth values) is applied. We have
$a \text{ XOR } b = 1$ if $a = 0$ and $b = 1$ or $a = 1$ and $b = 0$, and $a \text{ XOR } b = 0$
if $a = b = 0$ or $a = b = 1$. XORing two bits $a$ and $b$ means to add them
modulo 2, i.e., we have $a \text{ XOR } b = a + b \bmod 2$. As is common practice, we
denote the XOR-operator by $\oplus$, $a \oplus b := a \text{ XOR } b$, and we use $\oplus$ also for
the binary operator that bitwise XORs two bit strings. If $a = a_1 a_2 \ldots a_n$ and
$b = b_1 b_2 \ldots b_n$ are bit strings, then

$$a \oplus b := (a_1 \text{ XOR } b_1)(a_2 \text{ XOR } b_2) \ldots (a_n \text{ XOR } b_n).$$


Vernam’s One-Time Pad. The most famous example of a stream cipher
is Vernam’s one-time pad (see [Vernam19] and [Vernam26]). It is
easy to describe. Plaintexts, keys and ciphertexts are bit strings. To encrypt
a message $m := m_1 m_2 m_3 \ldots$, where $m_i \in \{0,1\}$, a key stream
$k := k_1 k_2 k_3 \ldots$, with $k_i \in \{0,1\}$, is needed. Encryption and decryption are
given by bitwise XORing with the key stream:

$$E^*(k,m) := k \oplus m \quad \text{and} \quad D^*(k,c) := k \oplus c.$$

Obviously, encryption and decryption are inverses of each other. Each bit
in the key stream is chosen at random and independently, and the key stream
is used only for the encryption of one message $m$. This fact explains the name
“one-time pad”. If a key stream $k$ were used twice to encrypt $m$ and $\overline{m}$, we
could derive $m \oplus \overline{m}$ from the cryptograms $c$ and $\overline{c}$ and thus obtain
information about the plaintexts, by computing

$$c \oplus \overline{c} = m \oplus k \oplus \overline{m} \oplus k = m \oplus \overline{m} \oplus k \oplus k = m \oplus \overline{m}.$$

There are obvious disadvantages to Vernam’s one-time pad. Truly random
keys of the same length as the message have to be generated and securely
transmitted to the recipient. There are very few situations where this is
practical. Reportedly, the hotline between Washington and Moscow was encrypted
with a one-time pad; the keys were transported by a trusted courier.
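The one-time pad and the danger of key reuse can be illustrated with a few lines of Python. This is only a minimal sketch: the helper names `xor_bytes`, `otp_encrypt` and `otp_decrypt` and the sample messages are illustrative assumptions, and bytes are used instead of single bits for convenience.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two byte strings of equal length."""
    if len(a) != len(b):
        raise ValueError("operands must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def otp_encrypt(key: bytes, message: bytes) -> bytes:
    """E*(k, m) := k XOR m."""
    return xor_bytes(key, message)


# Decryption is the same operation: D*(k, c) := k XOR c.
otp_decrypt = otp_encrypt

m1 = b"attack at dawn"
m2 = b"retreat at six"              # same length as m1
k = secrets.token_bytes(len(m1))    # fresh random key stream

c1 = otp_encrypt(k, m1)
c2 = otp_encrypt(k, m2)             # mistake: the key stream is used twice

# The key cancels out: c1 XOR c2 equals m1 XOR m2.
leak = xor_bytes(c1, c2)
```

An eavesdropper who sees `c1` and `c2` can compute `leak` without knowing `k`, which is exactly the XOR of the two plaintexts.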

Nevertheless, most practical stream ciphers work as Vernam’s one-time 
pad. The difference is that a pseudorandom key stream is taken instead of the 
truly random key stream. A pseudorandom key stream looks like a random 
key stream, but actually the bits are generated from a short (truly) random 
seed by a deterministic algorithm. In practice, such pseudorandom generators 
can be based on specific operation modes of block ciphers or on feedback shift 
registers. We study the operation modes of block ciphers in Section 2.2.3 (e.g. 
see the cipher and output feedback modes). Feedback shift registers can be 
implemented to run very fast on relatively simple hardware. This fact makes 
them especially attractive. More about these generators and stream ciphers 
can be found, for example, in [MenOorVan96]. There are also public-key 
stream ciphers, in which the pseudorandom key stream is generated by using 
public-key techniques. We discuss these pseudorandom bit generators and the 
resulting stream ciphers in detail in Chapters 8 and 9. 
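As a rough illustration of a key stream generated from a short seed, the following sketch uses a 16-bit feedback shift register. The register width, the tap positions and the function names are assumptions chosen for brevity; a single linear feedback shift register like this one is not cryptographically secure and only shows the mechanics.

```python
def lfsr_bits(seed: int):
    """Yield the output bits of a 16-bit linear feedback shift register (toy example)."""
    state = seed & 0xFFFF
    if state == 0:
        raise ValueError("the all-zero state would produce only zeros")
    while True:
        yield state & 1
        # Feedback bit: XOR of the tapped positions 0, 2, 3 and 5.
        feedback = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
        state = (state >> 1) | (feedback << 15)


def keystream(seed: int, length: int) -> bytes:
    """Collect `length` bytes of key stream from the register."""
    bits = lfsr_bits(seed)
    out = bytearray()
    for _ in range(length):
        byte = 0
        for _ in range(8):
            byte = (byte << 1) | next(bits)
        out.append(byte)
    return bytes(out)


def stream_xor(seed: int, data: bytes) -> bytes:
    """Encrypt or decrypt by XORing the data with the pseudorandom key stream."""
    return bytes(d ^ k for d, k in zip(data, keystream(seed, len(data))))


ciphertext = stream_xor(0xACE1, b"hello stream cipher")
recovered = stream_xor(0xACE1, ciphertext)
```

Both parties only need to share the short seed; the key stream itself is recomputed on each side.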

Back to Vernam’s one-time pad. Its advantage is that one can prove that 
it is secure — an adversary observing a cryptogram does not have the slightest 
idea what the plaintext is. We discuss this point in the simplest case, where 
the message m consists of a single bit. Alice and Bob want to exchange one 
of the messages yes = 1 or no = 0. Previously, they exchanged the key bit k, 
which was the outcome of an unbiased coin toss. 

First, we assume that each of the two messages yes and no is equally 
likely. The adversary, we call her Eve, intercepts the cryptogram c. Since 
the key bit is truly random, Eve can only derive that c encrypts yes or no 
with probability 1/2. Thus, she has not the slightest idea which of the two is
encrypted. Her only chance of making a decision is to toss a coin. She can do 
this, however, without seeing the cryptogram c. 

If one of the two messages has a greater probability, Eve also cannot 
gain any advantage by intercepting the cryptogram c. Assume, for example, 
that the probability of a 1 is 3/4 and the probability of a 0 is 1/4. Then 
the cryptogram c encrypts 0 with probability 1/4 and 1 with probability 3/4, 
irrespective of whether c = 0 or c = 1. Thus, Eve cannot learn more from 
the cryptogram than she has learned a priori about the distribution of the 
plaintexts. 
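The claim can be checked with Bayes' rule. The following computation is a sketch for the case $c = 0$, assuming (as above) that the key bit $k$ is uniformly distributed and independent of the message $m$, and using $c = m \oplus k$:

```latex
\begin{aligned}
P(m = 1 \mid c = 0)
  &= \frac{P(c = 0 \mid m = 1)\, P(m = 1)}
          {P(c = 0 \mid m = 1)\, P(m = 1) + P(c = 0 \mid m = 0)\, P(m = 0)} \\
  &= \frac{P(k = 1) \cdot \tfrac{3}{4}}
          {P(k = 1) \cdot \tfrac{3}{4} + P(k = 0) \cdot \tfrac{1}{4}}
   = \frac{\tfrac{1}{2} \cdot \tfrac{3}{4}}
          {\tfrac{1}{2} \cdot \tfrac{3}{4} + \tfrac{1}{2} \cdot \tfrac{1}{4}}
   = \frac{3}{4}.
\end{aligned}
```

The case $c = 1$ is symmetric, so the a posteriori probability of the plaintext equals its a priori probability in both cases.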

Our discussion for one-bit messages may be transferred to the general 
case of n-bit messages. The amount of information an attacker may obtain is 
made precise by information theory. The level of security we achieve with the 
one-time pad is called perfect secrecy (see Chapter 9 for details). Note that we 
have to assume that all messages have the same length n (if necessary, they
are padded out). Otherwise, some information (the length of the message)
would leak to the attacker.

The Vernam one-time pad not only resists a ciphertext-only attack as 
proven formally in Chapter 9, but it resists all the attacks defined in Chap- 
ter 1. Each cryptogram has the same probability. Eve does not learn anything, 
not even about the probabilities of the plaintexts, if she does not know them 
a priori. For each message, the key is chosen at random and independently 
from the previous ones. Thus, Eve cannot gain any advantage by observing 
plaintext-ciphertext pairs, not even if she has chosen the plaintexts adap- 
tively. 

The Vernam one-time pad ensures the confidentiality of messages, but it 
does not protect messages against modifications. If someone changes bits in 
the cryptogram and the decrypted cryptogram makes sense, the receiver will 
not notice it. 
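The missing integrity protection can be made concrete with a small sketch. The message format and the attacker's knowledge of the position of the digit are assumptions made for the example.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two byte strings of equal length."""
    return bytes(x ^ y for x, y in zip(a, b))


message = b"PAY 100"
key = secrets.token_bytes(len(message))
ciphertext = xor_bytes(key, message)

# The attacker does not know the key, but guesses that position 4 holds '1'.
# XORing the ciphertext with ('1' XOR '9') at that position turns '1' into '9'.
delta = bytes([0, 0, 0, 0, ord("1") ^ ord("9"), 0, 0])
tampered = xor_bytes(ciphertext, delta)

received = xor_bytes(key, tampered)   # the receiver decrypts as usual
```

The receiver obtains a meaningful plaintext and has no way of detecting the modification from the ciphertext alone.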


2.2 Block Ciphers 


Definition 2.3. A block cipher is a symmetric-key encryption scheme with
$M = C = \{0,1\}^n$ and key space $K = \{0,1\}^r$:

$$E : \{0,1\}^r \times \{0,1\}^n \longrightarrow \{0,1\}^n, \quad (k,m) \longmapsto E(k,m).$$

Using a secret key $k$ of binary length $r$, the encryption algorithm $E$ encrypts
plaintext blocks $m$ of a fixed binary length $n$ and the resulting ciphertext
blocks $c = E(k,m)$ also have length $n$. $n$ is called the block length of the
cipher.


Typical block lengths are 64 (as in DES) or 128 (as in AES), typical key 
lengths are 56 (as in DES) or 128, 192 and 256 (as in AES). 

Let us consider a block cipher $E$ with block length $n$ and key length $r$.
There are $2^n$ plaintext blocks and $2^n$ ciphertext blocks of length $n$. For a fixed
key $k$, the encryption function $E_k : m \mapsto E(k,m)$ maps $\{0,1\}^n$ bijectively
to $\{0,1\}^n$; it is a permutation¹ of $\{0,1\}^n$. Thus, to choose a key $k$ means
to select a permutation $E_k$ of $\{0,1\}^n$, and this permutation is then used
to encrypt the plaintext blocks. The $2^r$ permutations $E_k$, with $k$ running
through the set $\{0,1\}^r$ of keys, form an almost negligibly small subset in the
tremendously large set of all permutations of $\{0,1\}^n$, which consists of $(2^n)!$
elements. So, when we randomly choose an $r$-bit key $k$ for $E$, then we restrict
our selection of the encryption permutation to an extremely small subset.

1 A map $f : D \longrightarrow D$ is called a permutation of $D$, if $f$ is bijective.

From these considerations, we conclude that we cannot have the ideal
block cipher with perfect secrecy in practice. Namely, in the preceding Section
2.1, we discussed a stream cipher with perfect secrecy, the Vernam one-time
pad. Perfect secrecy results from a maximum amount of randomness: for each
message bit, a random key bit is chosen (we will prove in Chapter 9 that less
randomness in key generation destroys perfect secrecy, see Theorem 9.6). We
conclude that the maximal level of security in a block cipher also requires a
maximum of randomness, and this in turn means that, when choosing a key,
we would have to select a random element from the set of all permutations
of $\{0,1\}^n$. Unfortunately, this turns out to be completely impractical. We
could try to enumerate all permutations $\pi$ of $\{0,1\}^n$, $\pi_1, \pi_2, \pi_3, \ldots$, and then
randomly select one by randomly selecting an index (this index would be
the key). Since there are $(2^n)!$ permutations, we need $\log_2((2^n)!)$-bit numbers to
encode the indexes. By Stirling’s approximation formula $k! \approx \sqrt{2\pi k}\,(k/e)^k$,
we derive that $\log_2((2^n)!) \approx (n - 1.44)\,2^n$. This is a huge number. For a block
length $n$ of 64 bits, we would need approximately $2^{67}$ bytes to store a single
key. There is no storage medium with such capacity.
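The size estimate can be checked numerically. The sketch below computes the number of bits needed to index all permutations of n-bit blocks via the log-gamma function and compares it with the estimate (n - 1.44) * 2^n; the function names are illustrative assumptions.

```python
import math


def ideal_key_bits(n: int) -> float:
    """log2((2**n)!), computed as lgamma(2**n + 1) / ln(2)."""
    return math.lgamma(2.0 ** n + 1.0) / math.log(2.0)


def stirling_estimate_bits(n: int) -> float:
    """The estimate (n - 1.44) * 2**n derived from Stirling's formula."""
    return (n - 1.44) * 2.0 ** n


for n in (8, 16, 32, 64):
    exact = ideal_key_bits(n)
    estimate = stirling_estimate_bits(n)
    print(f"n = {n:2d}: log2((2^n)!) ~ {exact:.4e} bits, "
          f"estimate {estimate:.4e} bits, "
          f"about 2^{math.log2(exact / 8):.1f} bytes")
```

For n = 64 the key length comes out just below 2^67 bytes, in line with the figure given above.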

Thus, in a real block cipher, we have to restrict ourselves to much smaller 
keys and choose the encryption permutation $E_k$ for a key $k$ from a much
smaller set of $2^r$ permutations, with $r$ typically in the range of 56 to 256.
Nevertheless, the designers of a block cipher try to approximate the ideal. 
The idea is to get an encryption function which behaves like a randomly 
chosen function from the very huge set of all permutations. 


2.2.1 DES 


The data encryption standard (DES), originally specified in [FIPS46 1977], 
was previously the most widely used symmetric-key encryption algorithm. 
Governments, banks and applications in commerce took the DES as the basis 
for secure and authentic communication. 

We give a high-level description of the DES encryption and decryption
functions. The DES algorithm takes 56-bit keys and 64-bit plaintext messages
as inputs and outputs a 64-bit cryptogram:²

$$\mathrm{DES} : \{0,1\}^{56} \times \{0,1\}^{64} \longrightarrow \{0,1\}^{64}.$$

If the key $k$ is chosen, we get

$$\mathrm{DES}_k : \{0,1\}^{64} \longrightarrow \{0,1\}^{64}, \quad x \longmapsto \mathrm{DES}(k,x).$$

An encryption with $\mathrm{DES}_k$ consists of 16 major steps, called rounds. In
each of the 16 rounds, a 48-bit round key $k_i$ is used. The 16 round keys
$k_1, \ldots, k_{16}$ are computed from the 56-bit key $k$ by using an algorithm which
is studied in Exercise 1 at the end of this chapter.

2 Actually, the 56 bits of the key are packed with 8 bits of parity.

In the definition of DES, one of the basic building blocks is a map

$$f : \{0,1\}^{48} \times \{0,1\}^{32} \longrightarrow \{0,1\}^{32},$$

which transforms a 32-bit message block $x$ with a 48-bit round key $k$. $f$ is
composed of a substitution $S$ and a permutation $P$:

$$f(k,x) = P(S(E(x) \oplus k)).$$

The 32 message bits are extended to 48 bits, $x \mapsto E(x)$ (some of the 32 bits
are used twice), and XORed with the 48-bit round key $k$. The resulting 48
bits are divided into eight groups of 6 bits, and each group is substituted by 4
bits. Thus, we get 32 bits which are then permuted by $P$. The cryptographic
strength of the DES function depends on the design of $f$, especially on the
design of the eight famous S-boxes which handle the eight substitutions (for
details, see [FIPS46 1977]).
We define for $i = 1, \ldots, 16$

$$\phi_i : \{0,1\}^{32} \times \{0,1\}^{32} \longrightarrow \{0,1\}^{32} \times \{0,1\}^{32}, \quad (x,y) \longmapsto (x \oplus f(k_i,y),\, y).$$

$\phi_i$ transforms 64-bit blocks and for this transformation, a 64-bit block is split
into two 32-bit halves $x$ and $y$. We have³

$$\phi_i \circ \phi_i(x,y) = \phi_i(x \oplus f(k_i,y),\, y) = (x \oplus f(k_i,y) \oplus f(k_i,y),\, y) = (x,y).$$

Hence, $\phi_i$ is bijective and $\phi_i^{-1} = \phi_i$.⁴ The fact that $\phi_i$ is bijective does not
depend on any properties of $f$.
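That the round map is its own inverse for any choice of f can be tried out directly. In the sketch below, `f` is an arbitrary, non-invertible stand-in built from SHA-256; it is an assumption for illustration and has nothing to do with the real DES function f.

```python
import hashlib


def f(round_key: bytes, y: int) -> int:
    """Arbitrary stand-in for f: maps a round key and 32 bits to 32 bits."""
    digest = hashlib.sha256(round_key + y.to_bytes(4, "big")).digest()
    return int.from_bytes(digest[:4], "big")


def phi(round_key: bytes, x: int, y: int) -> tuple[int, int]:
    """One Feistel round: (x, y) -> (x XOR f(k_i, y), y)."""
    return x ^ f(round_key, y), y


k_i = b"\x01\x02\x03\x04\x05\x06"          # hypothetical 48-bit round key
x, y = 0x01234567, 0x89ABCDEF

once = phi(k_i, x, y)
twice = phi(k_i, *once)                     # applying phi again undoes it
```

Since `y` is passed through unchanged, the second application recomputes the same value `f(k_i, y)` and XORs it away.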
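The $\mathrm{DES}_k$ function is obtained by composing $\phi_1, \ldots, \phi_{16}$ and the map

$$\mu : \{0,1\}^{32} \times \{0,1\}^{32} \longrightarrow \{0,1\}^{32} \times \{0,1\}^{32}, \quad (x,y) \longmapsto (y,x),$$

which interchanges the left and the right half of a 64-bit block $(x,y)$.
Namely,

$$\mathrm{DES}_k : \{0,1\}^{64} \longrightarrow \{0,1\}^{64},$$

$$\mathrm{DES}_k(x) := \mathrm{IP}^{-1} \circ \phi_{16} \circ \mu \circ \phi_{15} \circ \ldots \circ \phi_2 \circ \mu \circ \phi_1 \circ \mathrm{IP}(x).$$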


Here, IP is a publicly known permutation without cryptographic significance. 

We see that a DES cryptogram is obtained by 16 encryptions of the same 
type using 16 different round keys that are derived from the original 56- 
bit key. $\phi_i$ is called the encryption of round $i$. After each round, except
the last one, the left and the right half of the argument are interchanged.
A block cipher that is computed by iteratively applying a round function
to the plaintext is called an iterated cipher. If the round function has the
form of the DES round function $\phi_i$, the cipher is called a Feistel cipher. H.
Feistel developed the Lucifer algorithm, which was a predecessor of the DES 
algorithm. The idea of using an alternating sequence of permutations and 
substitutions to get an iterated cipher can be attributed to C.E. Shannon 
(see [Shannon49]). 


3 $g \circ h$ denotes the composition of maps: $g \circ h(x) := g(h(x))$.
4 As usual, if $f : D \longrightarrow R$ is a bijective map, we denote the inverse map by $f^{-1}$.

Notation. We also write $\mathrm{DES}_{k_1 \ldots k_{16}}$ for $\mathrm{DES}_k$ to indicate that the round
keys, derived from $k$, are used in this order for encryption.

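The following Proposition 2.4 means that the DES encryption function
may also be used for decryption. For decryption, the round keys $k_1, \ldots, k_{16}$ are
supplied in reverse order.


Proposition 2.4. For all messages $x \in \{0,1\}^{64}$

$$\mathrm{DES}_{k_{16} \ldots k_1}(\mathrm{DES}_{k_1 \ldots k_{16}}(x)) = x.$$

In other words,

$$\mathrm{DES}_{k_{16} \ldots k_1} \circ \mathrm{DES}_{k_1 \ldots k_{16}} = \mathrm{id}.$$

Proof. Since $\phi_i^{-1} = \phi_i$ (see above) and, obviously, $\mu^{-1} = \mu$, we get

$$\begin{aligned}
&\mathrm{DES}_{k_{16} \ldots k_1} \circ \mathrm{DES}_{k_1 \ldots k_{16}} \\
&\quad = \mathrm{IP}^{-1} \circ \phi_1 \circ \mu \circ \phi_2 \circ \ldots \circ \mu \circ \phi_{16} \circ \mathrm{IP} \circ \mathrm{IP}^{-1} \circ \phi_{16} \circ \mu \circ \phi_{15} \circ \ldots \circ \phi_1 \circ \mathrm{IP} \\
&\quad = \mathrm{id}.
\end{aligned}$$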


This proves the proposition. 
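The statement of the proposition holds for any cipher with this structure, which can be checked on a toy example. The sketch below follows the composition of rounds and swaps described above but omits IP and uses an arbitrary stand-in for f and made-up round keys; it is not DES.

```python
import hashlib


def f(round_key: bytes, y: int) -> int:
    """Arbitrary stand-in for the round function f (32 bits in, 32 bits out)."""
    digest = hashlib.sha256(round_key + y.to_bytes(4, "big")).digest()
    return int.from_bytes(digest[:4], "big")


def phi(round_key: bytes, x: int, y: int) -> tuple[int, int]:
    """(x, y) -> (x XOR f(k_i, y), y)."""
    return x ^ f(round_key, y), y


def mu(x: int, y: int) -> tuple[int, int]:
    """Interchange the two halves."""
    return y, x


def feistel(round_keys: list[bytes], block: tuple[int, int]) -> tuple[int, int]:
    """Apply phi with each round key in turn, swapping halves after every round but the last."""
    x, y = block
    last = len(round_keys) - 1
    for i, key in enumerate(round_keys):
        x, y = phi(key, x, y)
        if i != last:
            x, y = mu(x, y)
    return x, y


round_keys = [bytes([i]) * 6 for i in range(1, 17)]   # 16 hypothetical 48-bit round keys
plaintext = (0xDEADBEEF, 0x0BADF00D)

ciphertext = feistel(round_keys, plaintext)
decrypted = feistel(round_keys[::-1], ciphertext)      # same function, keys reversed
```

Decryption needs no separate algorithm: the same routine is called with the round keys in reverse order.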


Shortly after DES was published, W. Diffie and M.E. Hellman criticized
the short key size of 56 bits in [DifHel77]. They suggested using
DES in multiple encryption mode. In triple encryption mode with three
independent 56-bit keys $k_1$, $k_2$ and $k_3$, the cryptogram $c$ is computed by
$\mathrm{DES}_{k_3}(\mathrm{DES}_{k_2}(\mathrm{DES}_{k_1}(m)))$. This can strengthen the DES because the set of
$\mathrm{DES}_k$ functions is not a group (i.e., $\mathrm{DES}_{k_2} \circ \mathrm{DES}_{k_1}$ is not a $\mathrm{DES}_k$ function), a
fact which was shown in [CamWie92]. Moreover, it is shown there that $10^{2499}$
is a lower bound for the size of the subgroup generated by the $\mathrm{DES}_k$ functions
in the symmetric group. A small order of this subgroup would imply a less
secure multiple encryption mode.

The DES algorithm is well-studied and a lot of cryptanalysis has been 
performed. Special methods like linear and differential cryptanalysis have 
been developed and applied to attempt to break DES. However, the best 
practical attack known is an exhaustive key search. Assume some plaintext- 
ciphertext pairs $(m_i, c_i)$, $i = 1, \ldots, n$, are given. An exhaustive key search
tries to find the key by testing $\mathrm{DES}(k, m_i) = c_i$, $i = 1, \ldots, n$, for all possible
$k \in \{0,1\}^{56}$. If such a $k$ is found, the probability that $k$ is really the key
is very high. Special computers were proposed to perform an exhaustive key
search (see [DifHel77]). In 1999, a specially designed supercomputer and a
worldwide network of nearly 100 000 PCs on the Internet were able to find
the key after 22 hours and 15 minutes (see [RSALabs]). This effort recovered
one key. This work would need to be repeated for each additional key to be 
recovered. 
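The principle of an exhaustive key search can be demonstrated on a toy scale. The cipher below is a made-up 16-bit example (an assumption for illustration, far weaker than DES), so that all 2^16 keys can be tested in a moment.

```python
def toy_encrypt(key: int, block: int) -> int:
    """Hypothetical 16-bit toy cipher (not DES): four rounds of XOR, rotate, add."""
    x = block & 0xFFFF
    for r in range(1, 5):
        x ^= key
        x = ((x << 3) | (x >> 13)) & 0xFFFF   # rotate left by 3 bits
        x = (x + key * r) & 0xFFFF
    return x


def exhaustive_search(pairs: list[tuple[int, int]]) -> list[int]:
    """Return every 16-bit key that maps each known plaintext to its ciphertext."""
    return [k for k in range(1 << 16)
            if all(toy_encrypt(k, m) == c for m, c in pairs)]


secret_key = 0xBEEF
known_plaintexts = (0x0000, 0x1234, 0xFFFF)
pairs = [(m, toy_encrypt(secret_key, m)) for m in known_plaintexts]

candidates = exhaustive_search(pairs)
```

Testing against several plaintext-ciphertext pairs filters out keys that match a single pair by chance; the work grows by a factor of two with every additional key bit.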

The key size and the block size of DES have become too small to resist the
progress in computer technology. The U.S. National Institute of Standards
and Technology (NIST) had standardized DES in the 1970s. After more than
20 “DES years” the search for a successor, the AES, was started.


2.2.2 AES 


In January 1997, the National Institute of Standards and Technology started 
an open selection process for a new encryption standard — the advanced en- 
cryption standard, or AES for short. NIST encouraged parties worldwide to 
submit proposals for the new standard. The proposals were required to sup- 
port a block size of at least 128 bits, and three key sizes of 128, 192 and 256 
bits. 

The selection process was divided into two rounds. In the first round, 15 
of the submitted 21 proposals were accepted as AES candidates. The candi- 
dates were evaluated by a public discussion. The international cryptographic 
community was asked for comments on the proposed block ciphers. Five 
candidates were chosen for the second round: MARS (IBM), RC6 (RSA), Ri- 
jndael (Daemen and Rijmen), Serpent (Anderson, Biham and Knudsen) and 
Twofish (Counterpane). Three international “AES Candidate Conferences” 
were held, and in October 2000 NIST selected the Rijndael cipher to be the 
AES (see [NIST 2000]). 

Rijndael (see [DaeRij02]) was developed by J. Daemen and V. Rijmen. It is 
an iterated block cipher and supports different block and key sizes. Block and 
key sizes of 128, 160, 192, 224 and 256 bits can be combined independently. 

The only difference between Rijndael and AES is that AES supports only 
a subset of Rijndael’s block and key sizes. The AES fixes the block length to 
128 bits, and uses the three key lengths 128, 192 and 256 bits. 

Besides encryption, Rijndael (like many block ciphers) is suited for other 
cryptographic tasks, for example, the construction of cryptographic hash 
functions (see Section 3.4.2) or pseudorandom bit generators (see Section 
2.2.3). Rijndael can be implemented efficiently on a wide range of processors 
and on dedicated hardware. 


Structure of Rijndael. Rijndael is an iterated block cipher. The iterations
are called rounds. The number of rounds, which we denote by $N_r$, depends
on the block length and the key length. In each round except the final
round, the same round function is applied, each time with a different round
key. The round function of the final round differs slightly. The round keys
$key_1, \ldots, key_{N_r}$ are derived from the secret key $k$ by using the key schedule
algorithm, which we describe below.

We use the terminology of [DaeRij02] in our description of Rijndael. A 
byte, as usual, consists of 8 bits, and by a word we mean a sequence of 32 
bits or, equivalently, 4 bytes. 

Rijndael is byte-oriented. Input and output (plaintext block, key, ciphertext
block) are considered as one-dimensional arrays of 8-bit bytes. Both
block length and key length are multiples of 32 bits. We denote by $N_b$ the
block length in bits divided by 32 and by $N_k$ the key length in bits divided
by 32. Thus, a Rijndael block consists of $N_b$ words (or $4 \cdot N_b$ bytes), and a
Rijndael key consists of $N_k$ words (or $4 \cdot N_k$ bytes).

The following table shows the number of rounds $N_r$ as a function of $N_k$
and $N_b$:


          N_b = 4    5    6    7    8
N_k = 4        10   11   12   13   14
N_k = 5        11   11   12   13   14
N_k = 6        12   12   12   13   14
N_k = 7        13   13   13   13   14
N_k = 8        14   14   14   14   14


In particular, AES with key length 128 bits (and the fixed AES block length 
of 128 bits) consists of 10 rounds. 
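Every entry of the table equals the larger of $N_k$ and $N_b$ plus 6. The following sketch encodes this observation; the function name and the default block length are assumptions for the example.

```python
VALID_SIZES = (128, 160, 192, 224, 256)   # Rijndael block and key sizes in bits


def rijndael_rounds(key_bits: int, block_bits: int = 128) -> int:
    """Number of rounds N_r for the given key and block length."""
    for bits in (key_bits, block_bits):
        if bits not in VALID_SIZES:
            raise ValueError(f"unsupported size: {bits} bits")
    n_k = key_bits // 32      # key length in 32-bit words
    n_b = block_bits // 32    # block length in 32-bit words
    return max(n_k, n_b) + 6


# AES fixes the block length to 128 bits:
aes_rounds = {bits: rijndael_rounds(bits) for bits in (128, 192, 256)}
```

For the three AES key lengths this gives 10, 12 and 14 rounds.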

The round function of Rijndael, and its steps, operate on an intermediate
result, called the state. The state is a block of $N_b$ words (or $4 \cdot N_b$ bytes).
At the beginning of an encryption, the variable state is initialized with the
plaintext block, and at the end, state contains the ciphertext block.

The intermediate result state is considered as a 4-row matrix of bytes with
$N_b$ columns. Each column contains one of the $N_b$ words of state.

The following table shows the state matrix in the case of block length 192 
bits. We have 6 state words. Each column of the matrix represents a state 
word consisting of 4 bytes. 


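$$\begin{pmatrix}
a_{0,0} & a_{0,1} & a_{0,2} & a_{0,3} & a_{0,4} & a_{0,5} \\
a_{1,0} & a_{1,1} & a_{1,2} & a_{1,3} & a_{1,4} & a_{1,5} \\
a_{2,0} & a_{2,1} & a_{2,2} & a_{2,3} & a_{2,4} & a_{2,5} \\
a_{3,0} & a_{3,1} & a_{3,2} & a_{3,3} & a_{3,4} & a_{3,5}
\end{pmatrix}$$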


The Rijndael Algorithm. An encryption with Rijndael consists of an initial
round key addition, followed by applying the round function $(N_r - 1)$
times, and a final round with a slightly modified round function. The round
function is composed of the SubBytes, ShiftRows and MixColumns steps and
an addition of the round key (see next section). In the final round, the
MixColumns step is omitted. A high level description of the Rijndael algorithm
follows:

Algorithm 2.5.
byteString Rijndael(byteString plaintextBlock, key)
    InitState(plaintextBlock, state)
    AddKey(state, key_0)
    for i ← 1 to N_r − 1 do
        SubBytes(state)
        ShiftRows(state)
        MixColumns(state)
        AddKey(state, key_i)
    SubBytes(state)
    ShiftRows(state)
    AddKey(state, key_{N_r})
    return state


The input and output blocks of the Rijndael algorithm are byte strings of
$4 \cdot N_b$ bytes. In the beginning, the state matrix is initialized with the plaintext
block. The matrix is filled column by column. The ciphertext is taken from 
the state matrix after the last round. Here, the matrix is read column by 
column. 
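Filling and reading the state matrix column by column can be sketched as follows. The function names `init_state` and `state_to_bytes` are illustrative assumptions; the state is represented as a list of 4 rows.

```python
def init_state(block: bytes) -> list[list[int]]:
    """Fill a 4-row state matrix column by column from a block of 4 * N_b bytes."""
    if len(block) % 4 != 0:
        raise ValueError("block length must be a multiple of 4 bytes")
    n_b = len(block) // 4
    return [[block[4 * col + row] for col in range(n_b)] for row in range(4)]


def state_to_bytes(state: list[list[int]]) -> bytes:
    """Read the state matrix column by column."""
    n_b = len(state[0])
    return bytes(state[row][col] for col in range(n_b) for row in range(4))


block = bytes(range(16))          # a 128-bit block: N_b = 4
state = init_state(block)
# state[0] is the first row: bytes 0, 4, 8, 12 of the block;
# the first column holds bytes 0, 1, 2, 3, i.e. the first word.
```

Reading the matrix back in the same order returns the original byte string.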

All steps of the round function — SubBytes, ShiftRows, MixColumns, Ad- 
dKey — are invertible. Therefore, decrypting with Rijndael means to apply 
the inverse functions of SubBytes, ShiftRows, MixColumns and AddKey, in 
the reverse order. 


The Round function. We describe now the steps (SubBytes, ShiftRows,
MixColumns and AddKey) of the round function. The Rijndael algorithm
and its steps are byte-oriented. They operate on the bytes of the state matrix.
In Rijndael, bytes are usually considered as elements of the finite field $\mathbb{F}_{2^8}$
with $2^8$ elements, and $\mathbb{F}_{2^8}$ is constructed as an extension of the field $\mathbb{F}_2$ with
2 elements by using the irreducible polynomial $X^8 + X^4 + X^3 + X + 1$ (see
Appendix A.5.3). Then adding (which is the same as bitwise XORing) and
multiplying bytes means to add and multiply them as elements of the field
$\mathbb{F}_{2^8}$.
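Byte arithmetic in this field can be sketched in a few lines. The names `gf_add` and `gf_mul` are assumptions; a byte is read as a polynomial over the field with 2 elements, with bit i as the coefficient of X^i, so that the irreducible polynomial corresponds to the bit pattern 0x11B.

```python
def gf_add(a: int, b: int) -> int:
    """Addition of two field elements: bitwise XOR of the bytes."""
    return a ^ b


def gf_mul(a: int, b: int) -> int:
    """Multiplication modulo X^8 + X^4 + X^3 + X + 1 (bit pattern 0x11B)."""
    result = 0
    while b:
        if b & 1:            # current coefficient of b is 1: add a * X^i
            result ^= a
        a <<= 1              # multiply a by X
        if a & 0x100:        # degree 8 reached: reduce by the polynomial
            a ^= 0x11B
        b >>= 1
    return result


product = gf_mul(0x57, 0x83)
```

The shift-and-reduce loop is the schoolbook multiplication of polynomials with the reduction applied after every multiplication by X.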


The SubBytes Step. SubBytes is the only non-linear transformation of
Rijndael. It substitutes the bytes of the state matrix byte by byte, by applying
the function $S_{RD}$⁵ to each element of the matrix state. The function $S_{RD}$ is
also called the S-box; it does not depend on the key. The same S-box is used
for all byte positions. The S-box $S_{RD}$ is composed of two maps, $f$ and $g$. First
$f$ and then $g$ is applied:

$$S_{RD}(x) = g \circ f(x) = g(f(x)) \quad (x \text{ a byte}).$$

Both maps, $f$ and $g$, have a simple algebraic description.

To understand $f$, we consider a byte $x$ as an element of the finite field
$\mathbb{F}_{2^8}$. Then $f$ simply maps $x$ to its multiplicative inverse $x^{-1}$:

$$f : \mathbb{F}_{2^8} \longrightarrow \mathbb{F}_{2^8}, \quad x \longmapsto \begin{cases} x^{-1} & \text{if } x \neq 0, \\ 0 & \text{if } x = 0. \end{cases}$$

To understand $g$, we consider a byte $x$ as a vector of 8 bits or, more precisely,
as a vector of length 8 over the field $\mathbb{F}_2$ with 2 elements⁶. Then $g$ is the
$\mathbb{F}_2$-affine map

$$g : \mathbb{F}_2^8 \longrightarrow \mathbb{F}_2^8, \quad x \longmapsto Ax + b,$$

composed of a linear map $x \mapsto Ax$ and a translation with vector $b$. The
matrix $A$ of the linear map and $b$ are given by

$$A = \begin{pmatrix}
1 & 0 & 0 & 0 & 1 & 1 & 1 & 1 \\
1 & 1 & 0 & 0 & 0 & 1 & 1 & 1 \\
1 & 1 & 1 & 0 & 0 & 0 & 1 & 1 \\
1 & 1 & 1 & 1 & 0 & 0 & 0 & 1 \\
1 & 1 & 1 & 1 & 1 & 0 & 0 & 0 \\
0 & 1 & 1 & 1 & 1 & 1 & 0 & 0 \\
0 & 0 & 1 & 1 & 1 & 1 & 1 & 0 \\
0 & 0 & 0 & 1 & 1 & 1 & 1 & 1
\end{pmatrix}, \quad
b = \begin{pmatrix} 1 \\ 1 \\ 0 \\ 0 \\ 0 \\ 1 \\ 1 \\ 0 \end{pmatrix}.$$

5 Rijmen and Daemen’s S-box.


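The S-box $S_{RD}$ operates on each of the state bytes of the state matrix
independently. For a block length of 128 bits, we have:

$$\begin{pmatrix}
a_{0,0} & a_{0,1} & a_{0,2} & a_{0,3} \\
a_{1,0} & a_{1,1} & a_{1,2} & a_{1,3} \\
a_{2,0} & a_{2,1} & a_{2,2} & a_{2,3} \\
a_{3,0} & a_{3,1} & a_{3,2} & a_{3,3}
\end{pmatrix} \longmapsto \begin{pmatrix}
S_{RD}(a_{0,0}) & S_{RD}(a_{0,1}) & S_{RD}(a_{0,2}) & S_{RD}(a_{0,3}) \\
S_{RD}(a_{1,0}) & S_{RD}(a_{1,1}) & S_{RD}(a_{1,2}) & S_{RD}(a_{1,3}) \\
S_{RD}(a_{2,0}) & S_{RD}(a_{2,1}) & S_{RD}(a_{2,2}) & S_{RD}(a_{2,3}) \\
S_{RD}(a_{3,0}) & S_{RD}(a_{3,1}) & S_{RD}(a_{3,2}) & S_{RD}(a_{3,3})
\end{pmatrix}$$

Both maps $f$ and $g$ are invertible. We even have $f = f^{-1}$. Thus the S-box
$S_{RD}$ is invertible and $S_{RD}^{-1} = f^{-1} \circ g^{-1} = f \circ g^{-1}$.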
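The two maps that make up the S-box can be computed directly rather than looked up in a table. The sketch below assumes the usual AES conventions: bit i of a byte is the i-th vector component (least significant bit first), the affine map uses the circulant matrix with ones in positions i, i+4, i+5, i+6, i+7 (mod 8) of row i, and the translation vector is the byte 0x63. The function names are illustrative.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiplication in the field with 2^8 elements (modulus 0x11B)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def f(x: int) -> int:
    """Multiplicative inverse, with 0 mapped to 0 (x^254 = x^-1 for x != 0)."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, x)
    return result if x else 0


def g(x: int) -> int:
    """Affine map x -> Ax + b over the field with 2 elements."""
    out = 0
    for i in range(8):
        bit = (x >> i) ^ (x >> ((i + 4) % 8)) ^ (x >> ((i + 5) % 8)) \
            ^ (x >> ((i + 6) % 8)) ^ (x >> ((i + 7) % 8)) ^ (0x63 >> i)
        out |= (bit & 1) << i
    return out


def s_box(x: int) -> int:
    """S-box: first f, then g."""
    return g(f(x))


table = [s_box(x) for x in range(256)]
```

Practical implementations precompute `table` once; the direct computation is shown only to mirror the algebraic description.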


The ShiftRows Step. The ShiftRows transformation performs a cyclic left
shift of the rows of the state matrix. The offsets are different for each row
and depend on the block length $N_b$.


N_b | 1. row | 2. row | 3. row | 4. row
 4  |   0    |   1    |   2    |   3
 5  |   0    |   1    |   2    |   3
 6  |   0    |   1    |   2    |   3
 7  |   0    |   1    |   2    |   4
 8  |   0    |   1    |   3    |   4

For a block length of 128 bits ($N_b = 4$), as in AES, ShiftRows is the map

$$\begin{pmatrix}
a & b & c & d \\
e & f & g & h \\
i & j & k & l \\
m & n & o & p
\end{pmatrix} \longmapsto \begin{pmatrix}
a & b & c & d \\
f & g & h & e \\
k & l & i & j \\
p & m & n & o
\end{pmatrix}.$$

6 Recall that the field $\mathbb{F}_2$ with 2 elements consists of the residues modulo 2, i.e.,
$\mathbb{F}_2 = \mathbb{Z}_2 = \{0,1\}$.


Obviously, ShiftRows is invertible. The inverse operation is obtained by 
cyclic right shifts with the same offsets. 
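ShiftRows and its inverse amount to rotating each row by its offset. A minimal sketch, with the offsets taken from the table above and the state represented as a list of 4 rows (the function names are assumptions):

```python
# Row offsets for each block length N_b (in 32-bit words).
OFFSETS = {
    4: (0, 1, 2, 3),
    5: (0, 1, 2, 3),
    6: (0, 1, 2, 3),
    7: (0, 1, 2, 4),
    8: (0, 1, 3, 4),
}


def shift_rows(state: list[list]) -> list[list]:
    """Cyclic left shift of each row by its offset."""
    n_b = len(state[0])
    return [row[off:] + row[:off] for row, off in zip(state, OFFSETS[n_b])]


def inv_shift_rows(state: list[list]) -> list[list]:
    """Cyclic right shift of each row by the same offset."""
    n_b = len(state[0])
    return [row[n_b - off:] + row[:n_b - off]
            for row, off in zip(state, OFFSETS[n_b])]


state = [list("abcd"), list("efgh"), list("ijkl"), list("mnop")]
shifted = shift_rows(state)
```

For the 4-column state this moves the first entry of the second row to its end, rotates the third row by two positions and the fourth by three.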


The MixColumns Step. The MixColumns transformation operates on
each column of the state matrix independently. We consider a column
$a = (a_0, a_1, a_2, a_3)$ as a polynomial $a(X) = a_3 X^3 + a_2 X^2 + a_1 X + a_0$ of
degree $\leq 3$, with coefficients in $\mathbb{F}_{2^8}$.
Then MixColumns transforms a column $a$ by multiplying it with the fixed
polynomial

$$c(X) := 03\,X^3 + 01\,X^2 + 01\,X + 02$$

and taking the residue of the product modulo $X^4 + 1$:

$$a(X) \longmapsto a(X) \cdot c(X) \bmod (X^4 + 1).$$


The coefficients of c are elements of $\mathbb{F}_{2^8}$. Hence, they are represented as bytes,
and a byte is given by two hexadecimal digits, as usual.

The transformations of MixColumns, multiplying by c(X) and taking the
residue modulo $X^4 + 1$, are $\mathbb{F}_{2^8}$-linear maps. Hence MixColumns is a linear
map of vectors of length 4 over $\mathbb{F}_{2^8}$. It is given by the following 4 × 4-matrix
over $\mathbb{F}_{2^8}$:

\[
\begin{pmatrix}
02 & 03 & 01 & 01 \\
01 & 02 & 03 & 01 \\
01 & 01 & 02 & 03 \\
03 & 01 & 01 & 02
\end{pmatrix}
\]


Again, bytes are represented by two hexadecimal digits. 
MixColumns transforms each column of the state matrix independently. 
For a block length of 128 bits, as in AES, we get 


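\[
\begin{pmatrix}
a_{0,0} & a_{0,1} & a_{0,2} & a_{0,3} \\
a_{1,0} & a_{1,1} & a_{1,2} & a_{1,3} \\
a_{2,0} & a_{2,1} & a_{2,2} & a_{2,3} \\
a_{3,0} & a_{3,1} & a_{3,2} & a_{3,3}
\end{pmatrix}
\mapsto
\begin{pmatrix}
b_{0,0} & b_{0,1} & b_{0,2} & b_{0,3} \\
b_{1,0} & b_{1,1} & b_{1,2} & b_{1,3} \\
b_{2,0} & b_{2,1} & b_{2,2} & b_{2,3} \\
b_{3,0} & b_{3,1} & b_{3,2} & b_{3,3}
\end{pmatrix},
\]
where
\[
\begin{pmatrix} b_{0,j} \\ b_{1,j} \\ b_{2,j} \\ b_{3,j} \end{pmatrix}
=
\begin{pmatrix}
02 & 03 & 01 & 01 \\
01 & 02 & 03 & 01 \\
01 & 01 & 02 & 03 \\
03 & 01 & 01 & 02
\end{pmatrix}
\begin{pmatrix} a_{0,j} \\ a_{1,j} \\ a_{2,j} \\ a_{3,j} \end{pmatrix},
\quad j = 0, 1, 2, 3.
\]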


The polynomial c(X) is relatively prime to $X^4 + 1$. Therefore c(X) is a
unit modulo $X^4 + 1$. Its inverse is
\[ d(X) = 0B\,X^3 + 0D\,X^2 + 09\,X + 0E, \]

i.e., $c(X) \cdot d(X) \bmod (X^4 + 1) = 1$. This implies that MixColumns is invertible.
The inverse operation is to multiply each column of the state matrix by d(X)
modulo $X^4 + 1$.
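
A minimal sketch of the column multiplication, assuming coefficient lists ordered from the constant term upwards and field elements reduced modulo P(X) = X^8 + X^4 + X^3 + X + 1. The helper names gf_mul and polymul_mod are illustrative. The sketch uses the fact that X^4 is congruent to 1 modulo X^4 + 1, so exponents can be reduced modulo 4.

```python
P = 0x11B  # X^8 + X^4 + X^3 + X + 1

def gf_mul(a, b):
    """Multiply two bytes as elements of F_{2^8}, reducing modulo P(X)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= P
        b >>= 1
    return result

def polymul_mod(a, c):
    """a(X) * c(X) mod (X^4 + 1); polynomials are lists [coeff_0, ..., coeff_3]."""
    out = [0, 0, 0, 0]
    for i in range(4):
        for j in range(4):
            out[(i + j) % 4] ^= gf_mul(a[i], c[j])  # X^(i+j) = X^((i+j) mod 4)
    return out

C = [0x02, 0x01, 0x01, 0x03]  # c(X) = 03 X^3 + 01 X^2 + 01 X + 02
D = [0x0E, 0x09, 0x0D, 0x0B]  # d(X) = 0B X^3 + 0D X^2 + 09 X + 0E

def mix_column(column):
    return polymul_mod(column, C)

def inv_mix_column(column):
    return polymul_mod(column, D)
```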


AddKey. The operation AddKey is the only operation in Rijndael that depends
on the secret key k, which is shared by the communication partners.
It adds a round key to the intermediate result state. The round keys are derived
from the secret key k by applying the key schedule algorithm, which is
described in the next section. Round keys are bit strings and, like the intermediate
result state, they have block length, i.e., each round key is a sequence
of $N_b$ words. AddKey simply bitwise XORs the state with the round key to
get the new value of state:


\[ (\mathit{state}, \mathit{roundkey}) \mapsto \mathit{state} \oplus \mathit{roundkey}. \]


Since we arrange state as a matrix, a round key is also represented as
a round key matrix of bytes with 4 rows and $N_b$ columns. Each of the $N_b$
words of the round key yields a column. Then the corresponding entries of
the state matrix and the round key matrix are bitwise XORed by AddKey
to get the new state matrix. Note that bitwise XORing two bytes means
adding two elements of the field $\mathbb{F}_{2^8}$.

Obviously, AddKey is invertible. It is inverse to itself. To invert it, you 
simply apply AddKey a second time with the same round key. 
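
As a small illustration, with the state and the round key both assumed to be given as lists of rows of byte values (the name add_key is chosen here):

```python
def add_key(state, round_key):
    """XOR corresponding bytes of the state matrix and the round key matrix."""
    return [[s ^ k for s, k in zip(state_row, key_row)]
            for state_row, key_row in zip(state, round_key)]
```

Applying add_key twice with the same round key returns the original state, which is the self-inverse property stated above.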


The Key Schedule. The secret key k consists of $N_k$ 4-byte words. The
Rijndael algorithm needs a round key for each round and one round key for
the initial key addition. Thus we have to generate $N_r + 1$ round keys (as
before, $N_r$ is the number of rounds). A round key consists of $N_b$ words. If we
concatenate all the round keys, we get a string of $N_b(N_r + 1)$ words. We call
this string the expanded key.

The expanded key is derived from the secret key k by the key expansion
procedure, which we describe below. The round keys


\[ key_0, key_1, key_2, \ldots, key_{N_r} \]


are then selected from the expanded key ExpKey: $key_0$ consists of the first
$N_b$ words of ExpKey, $key_1$ consists of the next $N_b$ words of ExpKey, and
so on.

To explain the key expansion procedure, we use functions $f_j$, defined
for multiples j of $N_k$, and a function g. All these functions map words
$(x_0, x_1, x_2, x_3)$, which each consist of 4 bytes, to words.

g simply applies the S-box $S_{RD}$ (see SubBytes above) to each byte:


\[ (x_0, x_1, x_2, x_3) \mapsto (S_{RD}(x_0), S_{RD}(x_1), S_{RD}(x_2), S_{RD}(x_3)). \]


If j is a multiple of $N_k$, i.e., $j \equiv 0 \bmod N_k$, we define $f_j$ by
\[ (x_0, x_1, x_2, x_3) \mapsto (S_{RD}(x_1) \oplus RC[j/N_k],\ S_{RD}(x_2),\ S_{RD}(x_3),\ S_{RD}(x_0)). \]


Here, so-called round constants RC[i] are used. They are defined as follows.
First, recall that in our representation, the elements of $\mathbb{F}_{2^8}$ are the residues of
polynomials with coefficients in $\mathbb{F}_2$ modulo $P(X) = X^8 + X^4 + X^3 + X + 1$.
Now, the round constant $RC[i] \in \mathbb{F}_{2^8}$ is defined by $RC[i] := X^{i-1} \bmod P(X)$.
Relying on the non-linear S-box $S_{RD}$, the functions $f_j$ and g are also
non-linear.
We are ready to describe the key expansion. We denote by


\[ ExpKey[j], \quad 0 \le j < N_b(N_r + 1), \]


the words of the expanded key. The first $N_k$ words are initialized with the
secret key k. The following words are computed recursively. ExpKey[j] depends
on $ExpKey[j - N_k]$ and on $ExpKey[j - 1]$.

Depending on the key length $N_k$, there are two versions of the key expansion
procedure, one for $N_k \le 6$, the other for $N_k > 6$. We have for $N_k \le 6$:


\[
ExpKey[j] := \begin{cases}
ExpKey[j - N_k] \oplus f_j(ExpKey[j - 1]) & \text{if } j \equiv 0 \bmod N_k, \\
ExpKey[j - N_k] \oplus ExpKey[j - 1] & \text{if } j \not\equiv 0 \bmod N_k.
\end{cases}
\]


If $N_k > 6$, we have:


\[
ExpKey[j] := \begin{cases}
ExpKey[j - N_k] \oplus f_j(ExpKey[j - 1]) & \text{if } j \equiv 0 \bmod N_k, \\
ExpKey[j - N_k] \oplus g(ExpKey[j - 1]) & \text{if } j \not\equiv 0 \bmod N_k \text{ and } j \equiv 4 \bmod N_k, \\
ExpKey[j - N_k] \oplus ExpKey[j - 1] & \text{else.}
\end{cases}
\]
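
The recursion can be sketched in a few lines. Words are assumed to be lists of 4 byte values, the S-box is recomputed from its definition (field inversion followed by the affine map, written here with bit rotations and the constant 0x63, with bit 0 as the least significant bit), and the names gf_mul, SBOX, rc and expand_key are illustrative choices.

```python
P = 0x11B  # X^8 + X^4 + X^3 + X + 1

def gf_mul(a, b):
    """Multiply two bytes as elements of F_{2^8}, reducing modulo P(X)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= P
        b >>= 1
    return result

def s_rd(x):
    inv = 1
    for _ in range(254):            # x^254 is the inverse of x; gives 0 for x = 0
        inv = gf_mul(inv, x)
    out = 0
    for i in range(8):              # affine map: bit i of A*inv + b
        bit = ((inv >> i) ^ (inv >> ((i + 4) % 8)) ^ (inv >> ((i + 5) % 8))
               ^ (inv >> ((i + 6) % 8)) ^ (inv >> ((i + 7) % 8)) ^ (0x63 >> i)) & 1
        out |= bit << i
    return out

SBOX = [s_rd(x) for x in range(256)]

def rc(i):
    """Round constant RC[i] = X^(i-1) mod P(X)."""
    value = 1
    for _ in range(i - 1):
        value = gf_mul(value, 0x02)
    return value

def expand_key(key_words, nb, nr):
    """Return the nb*(nr+1) words of the expanded key for a key of N_k words."""
    nk = len(key_words)
    exp = [list(w) for w in key_words]
    for j in range(nk, nb * (nr + 1)):
        prev = exp[j - 1]
        if j % nk == 0:             # f_j: rotate, substitute, add round constant
            t = [SBOX[prev[1]] ^ rc(j // nk), SBOX[prev[2]], SBOX[prev[3]], SBOX[prev[0]]]
        elif nk > 6 and j % nk == 4:
            t = [SBOX[b] for b in prev]   # g: substitute each byte
        else:
            t = prev
        exp.append([a ^ b for a, b in zip(exp[j - nk], t)])
    return exp
```

Round key i then consists of the words exp[i*nb] to exp[(i+1)*nb - 1].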


2.2.3 Modes of Operation 


Block ciphers need some extension, because in practice most of the messages 
have a size that is distinct from the block length. Often the message length 
exceeds the block length. Modes of operation handle this problem. They were 
first specified in conjunction with DES, but they can be applied to any block 
cipher. 

We consider a block cipher E with block length n. We fix a key k and, as
usual, we denote the encryption function with this key k by


\[ E_k : \{0,1\}^n \to \{0,1\}^n, \]


for example, $E_k = DES_k$. To encrypt a message m that is longer than n bits
we apply a mode of operation: The message m is decomposed into blocks
of some fixed bit length r, $m = m_1 m_2 \ldots m_l$, and then these blocks are
encrypted iteratively. The length r of the blocks $m_i$ is not equal to the
block length n of the cipher in all modes of operation. There are modes of
operation where r can be smaller than n, for example, the cipher feedback
and the output feedback modes below. In electronic codebook mode and
cipher-block chaining mode, which we discuss first, the block length r is
equal to the block length n of the block cipher.

If the block length r does not divide the length of our message, we have 
to complete the last block. The last block is padded out with some bits. 
After applying the decryption function, the receiver must remove the padding. 
Therefore, he must know how many bits were added. This can be achieved, 
for example, by storing the number of padded bits in the last byte of the last 
block. 
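
One possible padding rule of this kind, sketched at byte granularity rather than bit granularity (an assumption of this example), stores the number of added bytes in the final byte. The function names pad and unpad are illustrative, and the block length is assumed to be at most 255 bytes.

```python
def pad(message: bytes, block_len: int) -> bytes:
    """Pad to a multiple of block_len bytes; the last byte stores the pad length."""
    n_pad = block_len - len(message) % block_len   # 1 <= n_pad <= block_len
    return message + bytes(n_pad - 1) + bytes([n_pad])

def unpad(padded: bytes) -> bytes:
    """Remove the padding by reading its length from the last byte."""
    return padded[:-padded[-1]]
```

In this sketch a full block of padding is added when the message length is already a multiple of the block length, so that the receiver can always interpret the last byte as a length.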


Electronic Codebook Mode. The electronic code book mode is the 
straightforward mode. The encryption is deterministic — identical plaintext 
blocks result in identical ciphertext blocks. The encryption works like a code- 
book. Each block of m is encrypted independently of the other blocks. Trans- 
mission bit errors in a single ciphertext block affect the decryption only of 
that block. 

In this mode, we have r = n. The electronic codebook mode is imple- 
mented by the following algorithm: 


Algorithm 2.6.
bitString ecbEncrypt(bitString m)
    divide m into m_1 ... m_l
    for i ← 1 to l do
        c_i ← E_k(m_i)
    return c_1 ... c_l


For decryption, the same algorithm can be used with the decryption function
$E_k^{-1}$ in place of $E_k$.

If we encrypt many blocks, partial information about the plaintext is 
revealed. For example, an eavesdropper Eve detects whether a certain block 
repeatedly occurs in the sequence of plaintext blocks, or, more generally, she 
can figure out how often a certain plaintext block occurs. Therefore, other 
modes of operation are preferable. 
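
The leak can be observed in a small experiment. The block cipher below is a toy 4-round Feistel permutation on 8-byte blocks built from SHA-256; it is only a hypothetical stand-in for E_k and is not secure, and all names are chosen for this sketch.

```python
import hashlib

BLOCK = 8  # toy block length n in bytes (an assumption of this sketch)

def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def _f(key, i, half):
    # Round function of the toy Feistel cipher.
    return hashlib.sha256(key + bytes([i]) + half).digest()[:len(half)]

def toy_encrypt_block(key, block):
    left, right = block[:4], block[4:]
    for i in range(4):
        left, right = right, _xor(left, _f(key, i, right))
    return left + right

def toy_decrypt_block(key, block):
    left, right = block[:4], block[4:]
    for i in reversed(range(4)):
        left, right = _xor(right, _f(key, i, left)), left
    return left + right

def ecb(block_fn, key, data):
    """Apply block_fn to each block independently (ECB mode)."""
    assert len(data) % BLOCK == 0
    return b"".join(block_fn(key, data[i:i + BLOCK])
                    for i in range(0, len(data), BLOCK))
```

Encrypting a message whose first two blocks are equal yields a ciphertext whose first two blocks are equal as well, which is exactly what Eve can observe.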


Cipher-Block Chaining Mode. In this mode, we have r = n. Encryption 
in the cipher-block chaining mode is implemented by the following algorithm: 


Algorithm 2.7.
bitString cbcEncrypt(bitString m)
    select c_0 ∈ {0,1}^n at random
    divide m into m_1 ... m_l
    for i ← 1 to l do
        c_i ← E_k(m_i ⊕ c_{i−1})
    return c_0 c_1 ... c_l


Choosing the initial value $c_0$ at random prevents almost with certainty
that the same initial value $c_0$ is used for more than one encryption. This is
important for security. Suppose for a moment that the same $c_0$ is used for
two messages m and m’. Then, an eavesdropper Eve can immediately detect
whether the first l blocks of m and m’ coincide, because in this case the first
l ciphertext blocks are the same.

If a message is encrypted twice, then, with a very high probability, the 
initial values are different, and hence the resulting ciphertexts are distinct. 
The ciphertext depends on the plaintext, the key and a randomly chosen 
initial value. We obtain a randomized encryption algorithm. 

Decryption in cipher-block chaining mode is implemented by the following 
algorithm: 


Algorithm 2.8.
bitString cbcDecrypt(bitString c)
    divide c into c_0 c_1 ... c_l
    for i ← 1 to l do
        m_i ← E_k^{-1}(c_i) ⊕ c_{i−1}
    return m_1 ... m_l


The cryptogram $c = c_0 c_1 \ldots c_l$ has one block more than the plaintext.
The initial value $c_0$ need not be secret, but its integrity must be guaranteed
in order to decrypt $c_1$ correctly.

A transmission bit error in block $c_i$ affects the decryption of the blocks $c_i$
and $c_{i+1}$. The block recovered from $c_i$ will appear random (here we assume
that even a small change in the input of a block cipher will produce a
random-looking output), while the plaintext recovered from $c_{i+1}$ has bit errors
precisely where $c_i$ did. The block $c_{i+2}$ is decrypted correctly. The cipher-block
chaining mode is self-synchronizing, even if one or more entire blocks are lost.
A lost ciphertext block results in the loss of the corresponding plaintext block
and errors in the next plaintext block.
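
The two CBC algorithms and the error propagation just described can be tried out with the following sketch. The block cipher is again a toy 4-round Feistel permutation on 8-byte blocks (a hypothetical, insecure stand-in for E_k); the message length is assumed to be a multiple of 8 bytes, and os.urandom supplies the random initial value.

```python
import hashlib
import os

BLOCK = 8  # toy block length n in bytes

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def _f(key, i, half):
    return hashlib.sha256(key + bytes([i]) + half).digest()[:len(half)]

def toy_encrypt_block(key, block):
    left, right = block[:4], block[4:]
    for i in range(4):
        left, right = right, xor(left, _f(key, i, right))
    return left + right

def toy_decrypt_block(key, block):
    left, right = block[:4], block[4:]
    for i in reversed(range(4)):
        left, right = xor(right, _f(key, i, left)), left
    return left + right

def cbc_encrypt(key, m):
    prev = os.urandom(BLOCK)                 # random initial value c_0
    out = [prev]
    for i in range(0, len(m), BLOCK):
        prev = toy_encrypt_block(key, xor(m[i:i + BLOCK], prev))
        out.append(prev)
    return b"".join(out)                     # c_0 c_1 ... c_l

def cbc_decrypt(key, c):
    blocks = [c[i:i + BLOCK] for i in range(0, len(c), BLOCK)]
    return b"".join(xor(toy_decrypt_block(key, blocks[i]), blocks[i - 1])
                    for i in range(1, len(blocks)))
```

Flipping one bit of a ciphertext block garbles the corresponding plaintext block, flips the same bit in the next plaintext block, and leaves all later blocks intact.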

In both the electronic codebook mode and cipher-block chaining mode,
$E_k^{-1}$ is applied for decryption. Hence, both modes are also applicable with
public-key encryption methods, where the computation of $E_k^{-1}$ requires the
recipient’s secret, while $E_k$ can be easily computed by everyone.


Cipher Feedback Mode. Let $\mathrm{lsb}_l$ denote the l least significant (rightmost)
bits of a bit string, $\mathrm{msb}_l$ the l most significant (leftmost) bits of a bit string,
and let || denote the concatenation of bit strings.

In the cipher feedback mode, we have $1 \le r \le n$ (recall that the plaintext
m is divided into blocks of length r). Let $x_1 \in \{0,1\}^n$ be a randomly chosen
initial value. The cipher feedback mode is implemented by the following
algorithm:

Algorithm 2.9.
bitString cfbEncrypt(bitString m, x_1)
    divide m into m_1 ... m_l
    for i ← 1 to l do
        c_i ← m_i ⊕ msb_r(E_k(x_i))
        x_{i+1} ← lsb_{n−r}(x_i) || c_i
    return c_1 ... c_l


We get a stream cipher in this way. The key stream is computed by using
$E_k$, and depends on the key underlying $E_k$, on an initial value $x_1$ and on the
ciphertext blocks already computed. Actually, $x_{i+1}$ depends on the first $\lceil n/r \rceil$
members⁷ of the sequence $c_i, c_{i-1}, \ldots, c_1, x_1$. The key stream is obtained in
blocks of length r. The message can be processed bit by bit and messages
of arbitrary length can be encrypted without padding. If one block of the
key stream is consumed, the next block is computed. The initial value $x_1$
is transmitted to the recipient. It does not need to be secret if $E_k$ is the
encryption function of a symmetric cryptosystem (an attacker does not know
the key underlying $E_k$). The recipient can compute $E_k(x_1)$, and hence $m_1$ and
$x_2$, from $x_1$ and the cryptogram $c_1$, then $E_k(x_2)$, $m_2$ and $x_3$, and so on.

For each encryption, a new initial value $x_1$ is chosen at random. This
prevents almost with certainty that the same initial value $x_1$ is used for
more than one encryption. As in every stream cipher, this is important for
security. If the same initial value $x_1$ is used for two messages m and m’, then
an eavesdropper Eve immediately finds out whether the first l blocks of m
and m’ coincide. In this case, the first l blocks of the generated key stream,
and hence the first l ciphertext blocks are the same for m and m’.

A transmission bit error in block $c_i$ affects the decryption of that block and
the next $\lceil n/r \rceil$ ciphertext blocks. The block recovered from $c_i$ has bit errors
precisely where $c_i$ did. The next $\lceil n/r \rceil$ ciphertext blocks will be decrypted
into random-looking blocks (again we assume that even a small change in the
input of a block cipher will produce a random-looking output). The cipher
feedback mode is self-synchronizing after $\lceil n/r \rceil$ steps, even if one or more
entire blocks are lost.
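
A byte-oriented sketch of the cipher feedback mode follows. It assumes n = 8 bytes and a block length r given in bytes, and it uses a toy 4-round Feistel permutation built from SHA-256 as a hypothetical, insecure stand-in for E_k. Only the encryption direction of the block cipher is needed.

```python
import hashlib

N = 8  # toy block length n in bytes

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def toy_encrypt_block(key, block):
    """Toy 4-round Feistel permutation on 8-byte blocks (stand-in for E_k)."""
    left, right = block[:4], block[4:]
    for i in range(4):
        f = hashlib.sha256(key + bytes([i]) + right).digest()[:4]
        left, right = right, xor(left, f)
    return left + right

def cfb_encrypt(key, m, x1, r):
    x, out = x1, []
    for i in range(0, len(m), r):
        c = xor(m[i:i + r], toy_encrypt_block(key, x)[:r])   # m_i XOR msb_r(E_k(x_i))
        out.append(c)
        x = (x + c)[-N:]                                     # lsb_{n-r}(x_i) || c_i
    return b"".join(out)

def cfb_decrypt(key, c, x1, r):
    x, out = x1, []
    for i in range(0, len(c), r):
        block = c[i:i + r]
        out.append(xor(block, toy_encrypt_block(key, x)[:r]))
        x = (x + block)[-N:]
    return b"".join(out)
```

With r = 2 bytes and n = 8 bytes, a bit error in the first ciphertext block disturbs that block and the next four blocks; from the sixth block on, decryption is correct again.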

Output Feedback Mode. As in the cipher feedback mode, we have
$1 \le r \le n$. Let $x_1 \in \{0,1\}^n$ be a randomly chosen initial value. The output
feedback mode is implemented by the following algorithm:


Algorithm 2.10.
bitString ofbEncrypt(bitString m, x_1)
    divide m into m_1 ... m_l
    for i ← 1 to l do
        c_i ← m_i ⊕ msb_r(E_k(x_i))
        x_{i+1} ← E_k(x_i)
    return c_1 ... c_l


⁷ $\lceil x \rceil$ denotes the smallest integer $\ge x$.
There are two different output feedback modes discussed in the literature.
The one we introduced is considered to have better security properties and
was specified in [ISO/IEC 10116]. In the output feedback mode, plaintexts of
arbitrary length can be encrypted without padding. As in the cipher feedback
mode, the plaintext is considered as a bit stream and each bit is XORed with
a bit of a key stream. The key stream depends only on an initial value $x_1$ and
is iteratively computed by $x_{i+1} = E_k(x_i)$. The initial value $x_1$ is transmitted
to the recipient. It does not need to be secret if $E_k$ is the encryption function
of a symmetric cryptosystem (an attacker does not know the key underlying
$E_k$). For decryption, the same algorithm can be used.

It is essential for security that the initial value is chosen randomly and
independently from the previous ones. This prevents almost with certainty
that the same initial value $x_1$ is used for more than one encryption. If the same
initial value $x_1$ is used for two messages m and m’, then identical key streams
are generated for m and m’, and an eavesdropper Eve immediately computes
the difference between m and m’ from the ciphertexts: $m \oplus m' = c \oplus c'$. Thus,
it is strongly recommended to choose a new random initial value for each
message.

A transmission bit error in block $c_i$ only affects the decryption of that
block. The block recovered from $c_i$ has bit errors precisely where $c_i$ did.
However, the output feedback mode will not recover from a lost ciphertext
block: all following ciphertext blocks will be decrypted incorrectly.
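
The following byte-oriented sketch of the output feedback mode makes the danger of a reused initial value concrete. As before, the block cipher is a toy 4-round Feistel permutation on 8-byte blocks (a hypothetical, insecure stand-in for E_k), and r is given in bytes.

```python
import hashlib

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def toy_encrypt_block(key, block):
    """Toy 4-round Feistel permutation on 8-byte blocks (stand-in for E_k)."""
    left, right = block[:4], block[4:]
    for i in range(4):
        f = hashlib.sha256(key + bytes([i]) + right).digest()[:4]
        left, right = right, xor(left, f)
    return left + right

def ofb(key, data, x1, r):
    """Encrypts and decrypts: XOR the data with the key stream derived from x1."""
    x, out = x1, []
    for i in range(0, len(data), r):
        x_enc = toy_encrypt_block(key, x)
        out.append(xor(data[i:i + r], x_enc[:r]))   # data_i XOR msb_r(E_k(x_i))
        x = x_enc                                   # x_{i+1} = E_k(x_i)
    return b"".join(out)
```

If two messages are encrypted with the same key and the same initial value, the XOR of the two ciphertexts equals the XOR of the two plaintexts, without any knowledge of the key.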


Security. We mentioned before that the electronic codebook mode has some
shortcomings. The question arises as to what extent the mode of operation
weakens the cryptographic strength of a block cipher. A systematic treatment
of this question can be found in [BelDesJokRog97]. First, a model for the
security of block ciphers is developed, the so-called pseudorandom function
model or pseudorandom permutation model.

As discussed in Section 2.2, ideally we would like to choose the encryption
function of a block cipher from the huge set of all permutations on $\{0,1\}^n$
in a truly random way. This approach might be called the “truly random
permutation model”. In practice, we have to follow the “pseudorandom
permutation model”: the encryption function is chosen randomly, but from a
much smaller family $(F_k)_{k \in K}$ of permutations on $\{0,1\}^n$, like DES.

In [BelDesJokRog97], the security of the cipher-block chaining mode is reduced
to the security of the pseudorandom family $(F_k)_{k \in K}$. Here, security of
the family $(F_k)_{k \in K}$ means that no efficient algorithm is able to distinguish
elements randomly chosen from $(F_k)_{k \in K}$ from elements randomly chosen from
the set of all permutations. This notion of security for pseudorandom function
families is defined analogously to the notion of computationally perfect
pseudorandom bit generators, which will be studied in detail in Chapter 8.
[BelDesJokRog97] also consider a mode of operation similar to the output
feedback mode, called the XOR scheme, and its security is also reduced to
the security of the underlying pseudorandom function family.
Exercises


1. The following algorithm computes the round keys $k_i$, i = 1, ..., 16, for
DES from the 64-bit key k. Only 56 of the 64 bits are used and permuted.
This is done by a map PC1. The result PC1(k) is divided into two
halves, $C_0$ and $D_0$, of 28 bits.


Algorithm 2.11.
bitString DESKeyGenerator(bitString k)
    (C_0, D_0) ← PC1(k)
    for i ← 1 to 16 do
        (C_i, D_i) ← (LS_i(C_{i−1}), LS_i(D_{i−1}))
        k_i ← PC2(C_i, D_i)
    return k_1 ... k_16


Here $LS_i$ is a cyclic left shift by one position if i = 1, 2, 9 or 16, and by
two positions otherwise. The maps


\[ PC1 : \{0,1\}^{64} \to \{0,1\}^{56}, \quad PC2 : \{0,1\}^{56} \to \{0,1\}^{48} \]
are defined by the tables


PC1                      | PC2
57 49 41 33 25 17  9     | 14 17 11 24  1  5
 1 58 50 42 34 26 18     |  3 28 15  6 21 10
10  2 59 51 43 35 27     | 23 19 12  4 26  8
19 11  3 60 52 44 36     | 16  7 27 20 13  2
63 55 47 39 31 23 15     | 41 52 31 37 47 55
 7 62 54 46 38 30 22     | 30 40 51 45 33 48
14  6 61 53 45 37 29     | 44 49 39 56 34 53
21 13  5 28 20 12  4     | 46 42 50 36 29 32


The tables are read line by line and describe how to get the images, i.e.,


\[ PC1(x_1, \ldots, x_{64}) = (x_{57}, x_{49}, x_{41}, \ldots, x_{12}, x_4), \]

\[ PC2(x_1, \ldots, x_{56}) = (x_{14}, x_{17}, x_{11}, \ldots, x_{29}, x_{32}). \]


The bits 8, 16, 24, 32, 40, 48, 56 and 64 of k are not used. They are
defined in such a way that odd parity holds for each byte of k. A key k
is defined to be weak if $k_1 = k_2 = \ldots = k_{16}$.

Show that exactly four weak keys exist, and determine these keys. 


2. In this exercise, $\bar{x}$ denotes the bitwise complement of a bit string x. Let
$DES : \{0,1\}^{56} \times \{0,1\}^{64} \to \{0,1\}^{64}$ be the DES function.

a. Show that $DES(\bar{k}, \bar{x}) = \overline{DES(k, x)}$, for $k \in \{0,1\}^{56}$, $x \in \{0,1\}^{64}$.

b. Let $(m, DES_k(m))$ be a plaintext-ciphertext pair. We try to find the
unknown key k by an exhaustive key search. Show that the number
of encryptions we have to compute can be reduced from $2^{56}$ to $2^{55}$ if
the pair $(\bar{m}, DES_k(\bar{m}))$ is also known.

3. The key stream in the output feedback mode is periodic, i.e., there exists
an $i \in \mathbb{N}$ such that $x_{i+1} = x_1$. The lowest positive integer with this property
is called the period of the key stream. Let f be randomly chosen from
the set of all permutations on $\{0,1\}^n$. Show that the average period of
the key stream is $2^{n-1} + 1/2$ if the initial value $x_1 \in \{0,1\}^n$ is chosen at
random.
The basic idea of public-key cryptography is the use of public keys. Each person’s key
is separated into two parts: a public key for encryption available to everyone 
and a secret key for decryption which is kept secret by the owner. In this 
chapter we introduce the concept of public-key cryptography. Then we discuss 
some of the most important examples of public-key cryptosystems, such as 
the RSA, ElGamal and Rabin cryptosystems. These all provide encryption 
and digital signatures. 


3.1 The Concept of Public-Key Cryptography 


Classical symmetric cryptography provides a secure communication channel 
to each pair of users. In order to establish such a channel, the users must 
agree on a common secret key. After establishing a secure communication 
channel, the secrecy of a message can be guaranteed. Symmetric cryptography 
also includes methods to detect modifications of messages and methods to 
verify the origin of a message. Thus, confidentiality and integrity can be 
accomplished using secret key techniques. 

However, public key techniques have to be used for a secure distribution 
of secret keys, and at least some important forms of authentication and non- 
repudiation also require public-key methods, such as digital signatures. A 
digital signature should be the digital counterpart of a handwritten signa- 
ture. The signature must depend on the message to be signed and a secret 
known only to the signer. An unbiased third party should be able to verify 
the signature without access to the signer’s secret. 

In a public-key encryption scheme, the communication partners do not 
share a secret key. Each user has a pair of keys: a secret key sk known only 
to him and a public key pk known to everyone. 

Suppose Bob has such a key pair (pk, sk) and Alice wants to encrypt a
message m for Bob. Like everyone else, Alice knows Bob’s public key pk. She
computes the ciphertext c = E(pk, m) by applying the encryption function
E with Bob’s public key pk. As before, we denote encrypting with a fixed key
pk by $E_{pk}$, i.e., $E_{pk}(m) := E(pk, m)$. Obviously, the encryption scheme can
only be secure if it is practically infeasible to compute m from $c = E_{pk}(m)$.

But how can Bob then recover the message m from the ciphertext c? This
is where Bob’s secret key is used. The encryption function $E_{pk}$ must have
the property that the pre-image m of the ciphertext $c = E_{pk}(m)$ is easy to
compute using Bob’s secret key sk. Since only Bob knows the secret key, he
is the only one who can decrypt the message. Even Alice, who encrypted the
message m, would not be able to get m from $E_{pk}(m)$ if she lost m. Of course,
efficient algorithms must exist to perform encryption and decryption.

We summarize the requirements of public-key cryptography. We are looking
for a family of functions $(E_{pk})_{pk \in PK}$ such that each function $E_{pk}$ is
computable by an efficient algorithm. It should be practically infeasible to
compute pre-images of $E_{pk}$. Such families $(E_{pk})_{pk \in PK}$ are called families of
one-way functions or one-way functions for short. Here, PK denotes the set
of available public keys.¹ For each function $E_{pk}$ in the family, there should be
some information sk to be kept secret which enables an efficient computation
of the inverse of $E_{pk}$. This secret information is called the trapdoor information.
One-way functions with this property are called trapdoor functions.

In 1976, W. Diffie and M.E. Hellman published the idea of public-key
cryptography in their famous paper “New Directions in Cryptography”
([DifHel76]). They introduced a public-key method for key agreement which
is in use to this day. In addition, they described how digital signatures would 
work, and proposed, as an open question, the search for such a function. The 
first public-key cryptosystem that could function as both a key agreement 
mechanism and as a digital signature was the RSA cryptosystem published 
in 1978 ([RivShaAdl78]). RSA is named after the inventors: R. Rivest, A. 
Shamir and L. Adleman. The RSA cryptosystem provides encryption and 
digital signatures and is the most popular and widely used public-key cryp- 
tosystem today. We shall describe the RSA cryptosystem in Section 3.3. It is 
based on the difficulty of factoring large numbers, which enables the construc- 
tion of one-way functions with a trapdoor. Another basis for one-way func- 
tions is the difficulty of extracting discrete logarithms. These two problems 
from number theory are the foundations of most public-key cryptosystems 
used today. 

Each participant in a public-key cryptosystem needs his personal key
k = (pk, sk), consisting of a public and a secret (also called private) part. To
guarantee the security of the cryptosystem, it must be infeasible to compute 
the secret key sk from the public key pk, and it must be possible to randomly 
choose the keys k from a huge parameter space. An efficient algorithm must 
be available to perform this random choice. If Bob wants to participate in 
the cryptosystem, he randomly selects his key k = (pk, sk), keeps sk secret 
and publishes pk. Now everyone can use pk in order to encrypt messages for 
Bob. 

To discuss the basic idea of digital signatures, we assume that we have
a family $(E_{pk})_{pk \in PK}$ of trapdoor functions and that each function $E_{pk}$ is
bijective. Such a family of trapdoor permutations can be used for digital
signatures. Let pk be Alice’s public key. To compute the inverse $E_{pk}^{-1}$ of $E_{pk}$,
the secret key sk of Alice is required. So Alice is the only one who is able
to do this. If Alice wants to sign a message m, she computes $E_{pk}^{-1}(m)$ and
takes this value as signature s of m. Everyone can verify Alice’s signature s
by using Alice’s public key pk and computing $E_{pk}(s)$. If $E_{pk}(s) = m$, Bob is
convinced that Alice really signed m because only Alice was able to compute
$E_{pk}^{-1}(m)$.


¹ A rigorous definition of one-way functions is given in Definition 6.12.
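
To make the sign-and-verify idea concrete, here is a toy sketch that uses textbook RSA (treated in Section 3.3) as the trapdoor permutation. The tiny parameters p = 61, q = 53, e = 17 are assumptions made purely for illustration and offer no security; the function names are chosen here, and pow with a negative exponent requires Python 3.8 or later.

```python
# Toy trapdoor permutation on {0, ..., n-1}: textbook RSA with insecure parameters.
p, q = 61, 53
n = p * q                             # public modulus
e = 17                                # public exponent, pk = (n, e)
d = pow(e, -1, (p - 1) * (q - 1))     # secret exponent, the trapdoor sk

def E_pk(x):
    """Public direction: everyone can compute it."""
    return pow(x, e, n)

def E_pk_inv(y):
    """Inverse direction: requires the secret exponent d."""
    return pow(y, d, n)

def sign(m):
    return E_pk_inv(m)                # signature s = E_pk^{-1}(m)

def verify(m, s):
    return E_pk(s) == m               # accept if E_pk(s) = m
```

A verifier who knows only (n, e) can check signatures, while producing s for a given m requires d.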

An important straightforward application of public-key cryptosystems is 
the distribution of session keys. A session key is a secret key used in a classical 
symmetric encryption scheme to encrypt the messages of a single communica- 
tion session. If Alice knows Bob’s public key, then she may generate a session 
key, encrypt it with Bob’s public key and send it to Bob. Digital signatures 
are used to guarantee the authenticity of public keys by certification author- 
ities. The certification authority signs the public key of each user with her 
secret key. The signature can be verified with the public key of the certifica- 
tion authority. Cryptographic protocols for user authentication and advanced 
cryptographic protocols, like bit commitment schemes, oblivious transfer and 
zero-knowledge interactive proof systems, have been developed. Today they 
are fundamental to Internet communication and electronic commerce. 

Public-key cryptography is also important for theoretical computer science:
theories of security were developed, and its impact on complexity theory
should be mentioned.


3.2 Modular Arithmetic 


In this section, we give a brief overview of the modular arithmetic necessary 
to understand the cryptosystems we discuss in this chapter. Details can be 
found in Appendix A. 


3.2.1 The Integers 


Let Z denote the ordered set {..., −3, −2, −1, 0, 1, 2, 3, ...}. The elements
of Z are called integers or numbers. The integers greater than 0 are called
natural numbers and are denoted by N. The sum n + m and the product n · m
of integers are defined. Addition and multiplication satisfy the axioms of a
commutative ring with a unit element. We call Z the ring of integers.


Addition, Multiplication and Exponentiation. Efficient algorithms exist
for the addition and multiplication of numbers.² An efficient algorithm is
an algorithm whose running time is bounded by a polynomial in the size of its
input. The size of a number is the length of its binary encoding, i.e., the size
of $n \in \mathbb{N}$ is equal to $\lfloor \log_2(n) \rfloor + 1$.³ It is denoted by |n|. Let $a, b \in \mathbb{N}$, $a, b \le n$,
and $k := \lfloor \log_2(n) \rfloor + 1$. The number of bit operations for the computation of
a + b is O(k), whereas for the multiplication a · b it is $O(k^2)$. Multiplication
can be improved to $O(k \log_2(k))$ if a fast multiplication algorithm is used.


² Simplifying slightly, we only consider non-negative integers, which is sufficient
for all our purposes.

Exponentiation is also an operation that occurs often. The repeated squaring
method (Algorithm A.26) yields an efficient algorithm for the computation
of $a^n$. It requires at most 2 · |n| modular multiplications. We compute,
for example, $a^{16}$ by $(((a^2)^2)^2)^2$, which are four squarings, in contrast to the
15 multiplications that are necessary for the naive method. If the exponent
is not a power of 2, the computation has to be modified a little. For example,
$a^{14} = ((a^2 a)^2 a)^2$ is computable by three squarings and two multiplications,
instead of by 13 multiplications.
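
The operation counts in these two examples can be reproduced with a small sketch of repeated squaring that scans the binary digits of the exponent from left to right. It is only an illustration and need not coincide with Algorithm A.26; the function name power is chosen here, and n ≥ 1 is assumed.

```python
def power(a, n):
    """Compute a**n for n >= 1; returns (value, squarings, multiplications)."""
    result, squarings, mults = a, 0, 0
    for bit in bin(n)[3:]:          # binary digits of n after the leading 1
        result *= result            # one squaring per remaining digit
        squarings += 1
        if bit == "1":
            result *= a             # one extra multiplication per digit 1
            mults += 1
    return result, squarings, mults
```

For n = 14 (binary 1110) the loop performs square, multiply, square, multiply, square, which is the bracketing ((a^2 a)^2 a)^2.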


Division with Remainder. If m and n are integers, m ≠ 0, we can divide
n by m with a remainder. We can write n = q · m + r in a unique way such
that 0 ≤ r < abs(m).⁴ The number q is called the quotient and r is called the
remainder of the division. They are unique. Often we denote r by n mod m.

An integer m divides an integer n if n is a multiple of m, i.e., n = mq
for an integer q. We say, m is a divisor or factor of n. The greatest common
divisor gcd(m, n) of numbers m, n ≠ 0 is the largest positive integer dividing
m and n. gcd(0, 0) is defined to be zero. If gcd(m, n) = 1, then m is called
relatively prime to n, or prime to n for short.

The Euclidean algorithm computes the greatest common divisor of two 
numbers and is one of the oldest algorithms in mathematics: 


Algorithm 3.1.
int gcd(int a, b)
    while b ≠ 0 do
        r ← a mod b
        a ← b
        b ← r
    return abs(a)

The algorithm computes gcd(a, b) for a ≠ 0 and b ≠ 0. It terminates
because the non-negative number r decreases in each step. Note that gcd(a, b)
is invariant in the while loop, because gcd(a, b) = gcd(b, a mod b). In the last
step, the remainder r becomes 0 and we get gcd(a, b) = gcd(a, 0) = abs(a).
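
A direct transcription of the algorithm might look as follows. One caveat of this sketch: for a negative divisor, the % operator of Python returns a remainder with the sign of the divisor rather than a non-negative one, but the loop still terminates and the final abs gives the greatest common divisor.

```python
def gcd(a, b):
    """Euclidean algorithm: gcd(a, b) = gcd(b, a mod b) until the remainder is 0."""
    while b != 0:
        a, b = b, a % b
    return abs(a)
```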


Primes and Factorization. A natural number p ≠ 1 is a prime number,
or simply a prime, if 1 and p are the only divisors of p. If a number $n \in \mathbb{N}$
is not prime, it is called composite. Primes are essential for setting up the
public-key cryptosystems we describe in this chapter. Fortunately, there are
very fast algorithms (so-called probabilistic primality tests) for finding, at
least with a high probability (though not with mathematical certainty),
the correct answer to the question whether a given number is prime or not
(see Appendix A.8). Primes are the basic building blocks for numbers. This
statement is made precise by the Fundamental Theorem of Arithmetic.


³ $\lfloor x \rfloor$ denotes the greatest integer less than or equal to x.
⁴ abs(m) denotes the absolute value of m.


Theorem 3.2 (Fundamental Theorem of Arithmetic). Let $n \in \mathbb{N}$, $n \ge 2$.
There exist pairwise distinct primes $p_1, \ldots, p_k$ and exponents
$e_1, \ldots, e_k \in \mathbb{N}$, $e_i \ge 1$, i = 1, ..., k, such that


\[ n = \prod_{i=1}^{k} p_i^{e_i}. \]


The primes $p_1, \ldots, p_k$ and exponents $e_1, \ldots, e_k$ are unique.


It is easy to multiply two numbers, but the design of an efficient algorithm 
for calculating the prime factors of a number is an old mathematical problem. 
For example, this was already studied by the famous mathematician C.F.
Gauß about 200 years ago (see, e.g., [Riesel94] for details on Gauß’ factoring
method). However, to this day, we do not have a practical algorithm for 
factoring extremely large numbers. 


3.2.2 The Integers Modulo n 


The Residue Class Ring Modulo n. Let n be a positive integer. Let a
and b be integers. Then a is congruent to b modulo n, written $a \equiv b \bmod n$,
if a and b leave the same remainder when divided by n or, equivalently, if n
divides a − b. We obtain an equivalence relation. The equivalence class of a
is the set of all numbers congruent to a. It is denoted by [a] and called the
residue class of a modulo n. The set of residue classes $\{[a] \mid a \in \mathbb{Z}\}$ is called
the set of integers modulo n and is denoted by $\mathbb{Z}_n$.

Each number is congruent to a unique number r in the range
$0 \le r \le n - 1$. Therefore the numbers 0, ..., n − 1 form a set of representatives of the
elements of $\mathbb{Z}_n$. We call them the natural representatives.

The equivalence relation is compatible with addition and multiplication
in Z, i.e., if $a \equiv a' \bmod n$ and $b \equiv b' \bmod n$, then $a + b \equiv (a' + b') \bmod n$
and $a \cdot b \equiv (a' \cdot b') \bmod n$. Consequently, addition and multiplication on Z
induce an addition and multiplication on $\mathbb{Z}_n$:


\[ [a] + [b] := [a + b], \]
\[ [a] \cdot [b] := [a \cdot b]. \]


Addition and multiplication satisfy the axioms of a commutative ring with a
unit element. We call $\mathbb{Z}_n$ the residue class ring modulo n.

Although we can calculate in $\mathbb{Z}_n$ as in $\mathbb{Z}$, there are some
important differences. First, we do not have an ordering of the elements of
$\mathbb{Z}_n$ which is compatible with addition and multiplication. For
example, if we assume that we have such an ordering in $\mathbb{Z}_5$ and that
$[0] < [1]$, then $[0] < [1]+[1]+[1]+[1]+[1] = [0]$, which is a contradiction.
A similar calculation shows that the assumption $[1] < [0]$ also leads to a
contradiction.

Another fact is that $[a] \cdot [b]$ can be $[0]$ for $[a] \ne [0]$ and
$[b] \ne [0]$. For example, $[2] \cdot [3] = [0]$ in $\mathbb{Z}_6$. Such
elements, [a] and [b], are called zero divisors.


The Prime Residue Class Group Modulo n. In $\mathbb{Z}$, elements a and b
satisfy $a \cdot b = 1$ if and only if both a and b are equal to 1, or both are
equal to −1. We say that 1 and −1 have multiplicative inverse elements. In
$\mathbb{Z}_n$, this can happen more frequently. In $\mathbb{Z}_5$, for example,
every class different from [0] has a multiplicative inverse element. Elements
in a ring which have multiplicative inverses are called units and form a group
under multiplication.

An element [a] in $\mathbb{Z}_n$ has the multiplicative inverse element [b], if
$ab \equiv 1 \bmod n$ or, equivalently, n divides $1 - ab$. This means we have
an equation $nm + ab = 1$, with suitable m. The equation implies that
$\gcd(a,n) = 1$. On the other hand, if numbers a, n with $\gcd(a,n) = 1$ are
given, an equation $nm + ab = 1$, with suitable b and m, can be derived from a
and n by the extended Euclidean algorithm (Algorithm A.5). Hence, [a] is a unit
in $\mathbb{Z}_n$ and the inverse element is [b]. Thus, the elements of the
group of units of $\mathbb{Z}_n$ are represented by the numbers prime to n.


$$\mathbb{Z}_n^* := \{[a] \mid 1 \le a \le n-1 \text{ and } \gcd(a,n) = 1\}$$


is called the prime residue class group modulo n. The number of elements
in $\mathbb{Z}_n^*$ (also called the order of $\mathbb{Z}_n^*$) is the number of
integers in the interval $[1, n-1]$ which are prime to n. This number is
denoted by $\varphi(n)$. The function $\varphi$ is called the Euler phi
function or the Euler totient function.
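
The construction of inverses by the extended Euclidean algorithm can be sketched in a few lines of Python. This is only an illustration under our own naming (`extended_gcd`, `inverse_mod` and `units` are hypothetical helper names, not the book's Algorithm A.5 verbatim).

```python
from math import gcd

def extended_gcd(a, b):
    """Return (g, u, v) with g = gcd(a, b) and u*a + v*b == g."""
    old_r, r = a, b
    old_u, u = 1, 0
    old_v, v = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_u, u = u, old_u - q * u
        old_v, v = v, old_v - q * v
    return old_r, old_u, old_v

def inverse_mod(a, n):
    """Return b with a*b = 1 mod n; fail if a is not prime to n."""
    g, u, _ = extended_gcd(a % n, n)
    if g != 1:
        raise ValueError("a is not a unit modulo n")
    return u % n

def units(n):
    """Natural representatives of the prime residue class group modulo n."""
    return [a for a in range(1, n) if gcd(a, n) == 1]
```

For example, `units(6)` lists the two units 1 and 5 of the ring of integers modulo 6, and `inverse_mod(3, 7)` yields 5, since 3 · 5 = 15 leaves remainder 1 modulo 7.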

For every element a in a finite group G, we have $a^{|G|} = e$, with e being
the neutral element of G.⁵ This is an elementary and easy-to-prove feature
of finite groups. Thus, we have for a number a prime to n


$$a^{\varphi(n)} \equiv 1 \bmod n.$$


This is called Euler's Theorem or, if n is a prime, Fermat's Theorem.
If $\prod_{i=1}^{r} p_i^{e_i}$ is the prime factorization of n, then the Euler
phi function can be computed by the formula (see Corollary A.30)

$$\varphi(n) = n \prod_{i=1}^{r} \left(1 - \frac{1}{p_i}\right).$$
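
As a small numerical check, the following sketch computes the Euler phi function once by counting and once from a prime factorization, using the standard product of the terms $p_i^{e_i - 1}(p_i - 1)$. The function names and the sample value n = 360 are our own choices for illustration.

```python
from math import gcd

def phi_by_counting(n):
    """Count the integers in [1, n-1] that are prime to n."""
    return sum(1 for a in range(1, n) if gcd(a, n) == 1)

def phi_from_factorization(factors):
    """factors maps each prime p_i to its exponent e_i."""
    result = 1
    for p, e in factors.items():
        result *= p ** (e - 1) * (p - 1)
    return result

n = 360                                   # 360 = 2^3 * 3^2 * 5
phi_n = phi_from_factorization({2: 3, 3: 2, 5: 1})

# Euler's Theorem: a^phi(n) = 1 mod n for every a prime to n.
euler_holds = all(pow(a, phi_n, n) == 1
                  for a in range(1, n) if gcd(a, n) == 1)
```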


If p is a prime, then every integer in $\{1, \ldots, p-1\}$ is prime to p.
Therefore, every element in $\mathbb{Z}_p \setminus \{0\}$ is invertible and
$\mathbb{Z}_p$ is a field. The group of units $\mathbb{Z}_p^*$ is a cyclic group
with $p-1$ elements, i.e., $\mathbb{Z}_p^* = \{g, g^2, \ldots, g^{p-1} = [1]\}$
for some $g \in \mathbb{Z}_p^*$. Such a g is called a generator of
$\mathbb{Z}_p^*$. Generators are also called primitive elements modulo p or
primitive roots modulo p (see Definition A.37).


⁵ |G| denotes the number of elements of G (called the cardinality or order of G).

We can now introduce three functions which may be used as one-way 
functions and which are hence very important in cryptography. 


Discrete Exponentiation. Let p denote a prime number and g be a primitive
root in $\mathbb{Z}_p$.


$$\mathrm{Exp} : \mathbb{Z}_{p-1} \longrightarrow \mathbb{Z}_p^*, \quad x \longmapsto g^x$$


is called the discrete exponential function. Exp is a homomorphism from
the additive group $\mathbb{Z}_{p-1}$ to the multiplicative group
$\mathbb{Z}_p^*$, i.e., $\mathrm{Exp}(x + y) = \mathrm{Exp}(x) \cdot \mathrm{Exp}(y)$,
and Exp is bijective. In other words, Exp is an isomorphism
of groups. This follows immediately from the definition of a primitive root.
The inverse function


$$\mathrm{Log} : \mathbb{Z}_p^* \longrightarrow \mathbb{Z}_{p-1}$$


is called the discrete logarithm function. We use the adjective “discrete” to
distinguish Exp and Log for finite groups from the classical functions defined
for the reals.

Exp is efficiently computable, for example by the repeated squaring 
method (see Section 3.2.1), whereas no efficient algorithm is known to ex- 
ist for computing the inverse function Log for sufficiently large primes p. 
This statement is made precise by the discrete logarithm assumption (see 
Definition 6.1). 
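
The asymmetry between Exp and Log can be made concrete with a toy example. The sketch below implements repeated squaring and, for contrast, an exhaustive search for the discrete logarithm. The prime p = 23 with primitive root g = 5 and the function names are assumptions chosen for illustration; the exhaustive search is only feasible because p is tiny.

```python
def mod_exp(g, x, p):
    """Compute g^x mod p by repeated squaring (binary method)."""
    result = 1
    base = g % p
    while x > 0:
        if x & 1:                      # current binary digit of x is 1
            result = (result * base) % p
        base = (base * base) % p       # square for the next binary digit
        x >>= 1
    return result

def discrete_log_brute_force(g, y, p):
    """Find x in {0, ..., p-2} with g^x = y mod p by trying all x."""
    value = 1
    for x in range(p - 1):
        if value == y:
            return x
        value = (value * g) % p
    raise ValueError("y is not a power of g")
```

The number of multiplications in `mod_exp` grows with the bit length of x, whereas the loop in `discrete_log_brute_force` may need about p steps.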


Modular Powers. Let n denote the product of two distinct primes p and q
and let e be prime to $\varphi(n)$.


$$\mathrm{RSA}_e : \mathbb{Z}_n \longrightarrow \mathbb{Z}_n, \quad x \longmapsto x^e$$

is called the RSA function.


Proposition 3.3. Using the same notation as above, let d be a multiplicative
inverse element of e modulo $\varphi(n)$ (note that d is also prime to
$\varphi(n)$ and $\mathrm{RSA}_d$ is defined). Then


$$\mathrm{RSA}_d \circ \mathrm{RSA}_e = \mathrm{RSA}_e \circ \mathrm{RSA}_d = \mathrm{id}_{\mathbb{Z}_n}.$$


Proof. We show $x^{ed} = x$ for $x \in \mathbb{Z}_n$. First let
$x \in \mathbb{Z}_n^*$. The group $\mathbb{Z}_n^*$ has order $\varphi(n)$, hence
$x^{\varphi(n)} = [1]$ and therefore $x^{ed} = x^{ed \bmod \varphi(n)} = x$. In
the case $x \notin \mathbb{Z}_n^*$, p or q is a factor of x. If both divide x,
we have $x = 0$ and $x^{ed} = 0$. Thus the equalities hold. Observe that
$\varphi(n) = (p-1)(q-1)$ (see Corollary A.30). If p divides x and q does not
divide x, then $(x^e)^d \bmod p = 0$, $x \bmod p = 0$ and
$(x^e)^d \equiv x^{ed \bmod (q-1)} \equiv x \bmod q$, because
$ed \equiv 1 \bmod (q-1)$. This shows that $(x^e)^d \equiv x \bmod n$. The case
where p does not divide x and q divides x follows analogously. Thereby
$(x^e)^d = x^{ed} = x$ for all $x \in \mathbb{Z}_n$ and we have proven our
assertion.


We see that $\mathrm{RSA}_e$ is an (easily computable) permutation of
$\mathbb{Z}_n$. Knowing d, it is also easy to compute the inverse, which is
simply $\mathrm{RSA}_d$. However, if d is a secret, it is believed to be
infeasible to invert $\mathrm{RSA}_e$ (provided that p and q are very large).
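
Proposition 3.3 can be checked exhaustively for a toy modulus. The values p = 11, q = 13 and e = 7 below are assumptions chosen only to keep the numbers small, and the three-argument `pow` with exponent −1 requires Python 3.8 or later.

```python
p, q = 11, 13                 # toy primes; real RSA primes are far larger
n = p * q                     # 143
phi = (p - 1) * (q - 1)       # 120
e = 7                         # prime to 120
d = pow(e, -1, phi)           # inverse of e modulo phi(n)

def rsa(x, exponent, modulus):
    """The RSA function x -> x^exponent on the integers modulo `modulus`."""
    return pow(x, exponent, modulus)

# RSA_e permutes {0, ..., n-1}, and RSA_d undoes it (also for non-units).
images = [rsa(x, e, n) for x in range(n)]
is_permutation = sorted(images) == list(range(n))
round_trip_ok = all(rsa(rsa(x, e, n), d, n) == x for x in range(n))
```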


Modular Squares. Let p and q denote distinct prime numbers and $n = pq$.


$$\mathrm{Square} : \mathbb{Z}_n \longrightarrow \mathbb{Z}_n, \quad x \longmapsto x^2$$

is called the Square function. Each element $y \in \mathbb{Z}_n$, $y \ne 0$,
has either 0, 2 or 4 pre-images. If both p and q are primes
$\equiv 3 \bmod 4$, then −1 is not a square modulo p and modulo q, and it
easily follows that Square becomes a bijective map by restricting the domain
and range to the subset of squares in $\mathbb{Z}_n^*$ (for details see
Appendix A.6). If the factors of n are known, pre-images of Square (called
square roots) are efficiently computable (see Proposition A.62). Again,
without knowing the factors p and q, computing square roots is practically
impossible for p, q sufficiently large.


On the Difficulty of Extracting Roots and Discrete Logarithms. Let
$g, k \in \mathbb{N}$, $g, k \ge 2$, and let

$$F : \mathbb{Z} \longrightarrow \mathbb{Z}$$

denote one of the maps $x \mapsto x^2$, $x \mapsto x^k$ or $x \mapsto g^x$.
Usually these maps are considered as real functions. Efficient algorithms for
the computation of values and pre-images are known. These algorithms rely
heavily on the ordering of the reals and the fact that F is, at least
piecewise, monotonic. The map F is efficiently computable by integer
arithmetic with a fast exponentiation algorithm (see Section 3.2.1). If we use
such an algorithm to compute F, we get the following algorithm for computing a
pre-image x of $y = F(x)$.

For simplicity, we restrict the domain to the positive integers. Then all
three functions are monotonic and hence injective. It is easy to find integers
a and b with $a \le x \le b$. If $F(a) \ne y$ and $F(b) \ne y$, we call
FInvers(y, a, b).


Algorithm 3.4.

int FInvers(int y, a, b)
1   repeat
2       c ← (a + b) div 2
3       if F(c) < y
4           then a ← c
5           else b ← c
6   until F(c) = y
7   return c

F is efficiently computable. The repeat-until loop terminates after
$O(\log_2(|b - a|))$ steps. Hence, FInvers is also efficiently computable, and
we see that F considered as a function from $\mathbb{Z}$ to $\mathbb{Z}$ can
easily be inverted in an efficient way.
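
For readers who want to experiment, Algorithm 3.4 translates almost line by line into Python. The sketch below passes F as a parameter, which is our own addition. It assumes, as in the text, that F is increasing on [a, b] and that the pre-image lies strictly between a and b.

```python
def f_invers(F, y, a, b):
    """Bisection search for x with F(x) == y (Algorithm 3.4)."""
    while True:
        c = (a + b) // 2
        if F(c) < y:       # this comparison needs an ordering of the values
            a = c
        else:
            b = c
        if F(c) == y:
            return c

cube_root = f_invers(lambda x: x ** 3, 12167, 1, 100)
binary_log = f_invers(lambda x: 2 ** x, 1024, 1, 64)
```

The comparison `F(c) < y` is exactly the step that has no counterpart once the same maps are considered modulo n.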

Now consider the same maps modulo n. The function F is still efficiently
computable (see above or Algorithm A.26). The algorithm FInvers, however,
does not work in modular arithmetic. The reason is that in modular
arithmetic, it does not make sense to ask in line 3 whether $F(c) < y$. The
ring $\mathbb{Z}_n$ has no order which is compatible with the arithmetic
operations. The best we could do to adapt the algorithm above for modular
arithmetic is to test all elements c in $\mathbb{Z}_n$ until we reach
$F(c) = y$. However, this leads to an algorithm with exponential running time
(n is exponential in the binary length |n| of n).

If the factors of n are kept secret, no efficient algorithm is known today
to invert RSA and Square. The same holds for Exp. It is widely believed
that no efficient algorithms exist to compute the pre-images. However, no one
has been able to prove this statement so far. These assumptions are defined
in detail in Chapter 6. They are the basis for security proofs in public-key
cryptography (Chapters 9 and 10).

The RSA cryptosystem is based on facts from elementary number theory
which have been known for 250 years. To set up an RSA cryptosystem, we
have to multiply two very large primes and make their product n public. The
product n is part of the public key, whereas the factors of n are kept secret
and are used as the secret key. The basic idea is that the factors of n cannot
be recovered from n. In fact, the security of the RSA encryption function
depends on the tremendous difficulty of factoring, but the equivalence is not
proven.

We now describe in detail how RSA works. We discuss key generation, 
encryption and decryption as well as digital signatures. 


3.3.1 Key Generation and Encryption 


Key Generation. Each user Alice of the RSA cryptosystem has her own
public and secret keys. The key generation algorithm proceeds in three steps
(see also Section 6.4):


1. Choose large distinct primes p and q, and compute $n = p \cdot q$.

2. Choose e that is prime to $\varphi(n)$. The pair (n, e) is published as the
   public key.

3. Compute d with $ed \equiv 1 \bmod \varphi(n)$. (n, d) is used as the secret key.

Recall that $\varphi(n) = (p-1)(q-1)$ (Corollary A.30). The numbers n, e and
d are referred to as the modulus, and encryption and decryption exponents,
respectively. To decrypt a ciphertext or to generate a digital signature, Alice
only needs her decryption exponent d; she does not need to know the primes
p and q. Nevertheless, knowing p and q can be helpful for her (e.g. to speed
up decryption, see below). At any time, Alice can derive the primes from n, e
and d by an efficient algorithm with very high probability (see Exercise 4).

An adversary should not have the slightest idea what Alice’s primes are.
Therefore, we proceed as follows to get the primes p and q. First we choose
a large number x at random. If x is even, we replace x by x + 1 and apply
a probabilistic primality test to check whether x is a prime (see Appendix
A.8). If x is not a prime number, we replace x with x + 2, and so on until
the first prime is reached. We expect to test $O(\ln(x))$ numbers for primality
before reaching the first prime (see Corollary A.69). The method described
does not produce primes with mathematical certainty (we use a probabilistic
primality test), but it is sufficient for practical purposes. At the moment, it
is suggested to take 512-bit prime numbers. No one can predict for how long
such numbers will be secure, because it is difficult to predict improvements
in factorization algorithms and computer technology.

The number e can also be chosen at random. Whether e is prime to
$\varphi(n)$ is tested with Euclid’s algorithm (Algorithm A.4). Another method
for obtaining e is to choose a prime between max(p, q) and $\varphi(n)$, which
guarantees that it will be relatively prime to $\varphi(n)$. We can do this in
the same way as choosing p and q. The number d can be computed with the
extended Euclidean algorithm (Algorithm A.5).
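
The key generation procedure can be sketched as follows. The Miller-Rabin test stands in for "a probabilistic primality test" and is our choice, not necessarily the test of Appendix A.8. Forcing the top bit of the random start value and all function names are likewise assumptions made for this illustration; `pow(e, -1, phi)` requires Python 3.8 or later.

```python
import math
import secrets

def is_probable_prime(x, rounds=40):
    """Miller-Rabin test; a composite passes with probability <= 4^(-rounds)."""
    if x < 2:
        return False
    for small in (2, 3, 5, 7, 11, 13):
        if x % small == 0:
            return x == small
    s, t = 0, x - 1
    while t % 2 == 0:                       # write x - 1 = 2^s * t with t odd
        s, t = s + 1, t // 2
    for _ in range(rounds):
        a = secrets.randbelow(x - 3) + 2    # random base in [2, x-2]
        y = pow(a, t, x)
        if y in (1, x - 1):
            continue
        for _ in range(s - 1):
            y = pow(y, 2, x)
            if y == x - 1:
                break
        else:
            return False                    # a witnesses that x is composite
    return True

def next_prime_from_random_start(bits):
    """Pick a random odd number and step by 2 until a probable prime is hit."""
    x = secrets.randbits(bits) | (1 << (bits - 1)) | 1
    while not is_probable_prime(x):
        x += 2
    return x

def generate_rsa_keys(bits):
    """Return ((n, e), (n, d)) for primes of about `bits` bits each."""
    p = next_prime_from_random_start(bits)
    q = next_prime_from_random_start(bits)
    while q == p:
        q = next_prime_from_random_start(bits)
    n, phi = p * q, (p - 1) * (q - 1)
    while True:                             # random e, tested with gcd
        e = secrets.randbelow(phi - 3) + 3
        if math.gcd(e, phi) == 1:
            break
    d = pow(e, -1, phi)                     # extended Euclid inside pow
    return (n, e), (n, d)
```

The module `secrets` draws from the operating system's random source; a deterministic pseudorandom generator seeded with a guessable value would defeat the purpose.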

To choose a number at random, we may use a pseudorandom number 
generator. This is an algorithm that generates a sequence of digits which 
look like a sequence of random digits. There is a wide array of literature 
concerning efficient and secure generation of pseudorandom numbers (see, 
e.g., [MenOorVan96]). We also discuss the subject in Chapter 8. 


Encryption and Decryption. We encrypt messages in $\{0, \ldots, n-1\}$,
considered as elements of $\mathbb{Z}_n$.


1. The encryption function is defined by

   $$E : \mathbb{Z}_n \longrightarrow \mathbb{Z}_n, \quad x \longmapsto x^e.$$


2. The decryption function is of the same type, and is defined by

   $$D : \mathbb{Z}_n \longrightarrow \mathbb{Z}_n, \quad x \longmapsto x^d.$$

E and D are bijective maps and inverse to each other:
$E \circ D = D \circ E = \mathrm{id}_{\mathbb{Z}_n}$ (see Proposition 3.3).
Encryption and decryption can be implemented using an efficient algorithm
(Algorithm A.26).

With the basic encryption procedure, we can encrypt bit sequences up
to $k := \lfloor \log_2(n) \rfloor$ bits. If our messages are longer, we may
decompose them into blocks of length k and apply the scheme described in
Section 3.3.4 or a suitable mode of operation, for example the electronic
codebook mode or the cipher-block chaining mode (see Section 2.2.3). The cipher
feedback mode and the output feedback mode are not immediately applicable.
Namely, if the initial value is not kept secret, everyone can decrypt the
cryptogram. See Chapter 9 for an application of the output feedback mode with
RSA.


Security. An adversary knowing the factors p and q of n also knows
$\varphi(n) = (p-1)(q-1)$, and then derives d from the public encryption key e
using the extended Euclidean algorithm (Algorithm A.5). Thus, the security of
RSA depends on the difficulty of finding the factors p and q of n, if p and q
are large primes. It is widely believed that it is impossible today to factor
n by an efficient algorithm if p and q are sufficiently large. This belief is
known as the factoring assumption (for a precise definition, see Definition
6.9). An efficient factoring algorithm would break RSA. It is not proven
whether factoring is necessary to decrypt RSA ciphertexts, but it is also
believed that inverting the RSA function is intractable. This statement is
made precise by the RSA assumption (see Definition 6.7).

In the construction of the RSA keys, the only inputs for the computation
of the exponents e and d are $\varphi(n)$ and n. Since
$\varphi(n) = (p-1)(q-1)$, we have


$$p + q = n - \varphi(n) + 1 \quad \text{and} \quad p - q = \sqrt{(p+q)^2 - 4n} \quad (\text{if } p > q).$$


Therefore, it is easy to compute the factors of n if $\varphi(n)$ is known.
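
The two identities translate directly into a short routine. The sketch below (function name ours) recovers p and q from n and the value of the Euler phi function, using only integer arithmetic.

```python
from math import isqrt

def factor_from_phi(n, phi):
    """Recover (p, q) with p >= q from n = p*q and phi = (p-1)*(q-1)."""
    s = n - phi + 1                 # p + q
    disc = s * s - 4 * n            # (p - q)^2
    t = isqrt(disc)                 # p - q, if disc is a perfect square
    if t * t != disc:
        raise ValueError("phi does not match a modulus of the form p*q")
    return (s + t) // 2, (s - t) // 2
```

For the toy modulus n = 143 with phi = 120 we get p + q = 24 and p − q = 2, hence the factors 13 and 11.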

The factorization of n can be reduced to an algorithm A that computes d
from n and e (see Exercise 4). The resulting factoring algorithm A′ is
probabilistic and of the same complexity as A. (This fact was already
mentioned in [RivShaAdl78].)

It is an open question as to whether an efficient algorithm for factoring can
be derived from an efficient algorithm inverting RSA, i.e., an efficient
algorithm that on inputs n, e and $x^e$ outputs x. A result of Boneh and
Venkatesan (see [BonVen98]) provides evidence that, for a small encryption
exponent e, inverting the RSA function might be easier than factoring n. They
show that an efficient factoring algorithm A which uses, as a subroutine, an
algorithm for computing e-th roots (called an oracle for e-th roots) can be
converted into an efficient factoring algorithm B which does not call the
oracle. However, they use a restricted computation model for the algorithm A:
only algebraic reductions are allowed. Their result says that factoring is
easy, if an efficient algorithm like A exists in the restricted computational
model. The result of Boneh and Venkatesan does not expose any weakness in the
RSA cryptosystem.

The decryption exponent d should be greater than $n^{1/4}$. For
$d < n^{1/4}$, a polynomial-time algorithm to compute d has been developed
([Wiener90]). The algorithm uses the continued fraction expansion of $e/n$.
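
The idea can be illustrated by a compact sketch in the usual textbook form of the continued fraction attack; it is not taken from [Wiener90] verbatim. Each convergent k/d of e/n is tried as a candidate for the fraction k/d in $ed - k\varphi(n) = 1$. The toy key n = 90581, e = 17993 (with secret d = 5) is an assumption chosen for illustration.

```python
from math import isqrt

def convergents(num, den):
    """Yield the convergents (k, d) of the continued fraction of num/den."""
    k_prev, k_cur = 0, 1
    d_prev, d_cur = 1, 0
    while den != 0:
        a = num // den
        num, den = den, num - a * den
        k_prev, k_cur = k_cur, a * k_cur + k_prev
        d_prev, d_cur = d_cur, a * d_cur + d_prev
        yield k_cur, d_cur

def small_exponent_attack(n, e):
    """Return d if some convergent of e/n reveals it, else None."""
    for k, d in convergents(e, n):
        if k == 0 or (e * d - 1) % k != 0:
            continue
        phi = (e * d - 1) // k          # candidate for phi(n)
        s = n - phi + 1                 # candidate for p + q
        disc = s * s - 4 * n            # candidate for (p - q)^2
        if disc >= 0:
            t = isqrt(disc)
            if t * t == disc and (s + t) % 2 == 0:
                return d                # candidate is consistent with n = p*q
    return None
```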

Efficient factoring algorithms are known for special types of primes p and
q. To give these algorithms no chance, we have to avoid such primes. First
we require that the absolute value $|p - q|$ is large. This prevents the
following attack: We have $(p+q)^2/4 - n = (p+q)^2/4 - pq = (p-q)^2/4$. If
$|p - q|$ is small, then $(p-q)^2/4$ is also small and therefore $(p+q)^2/4$ is
slightly larger than n. Thus $(p+q)/2$ is slightly larger than $\sqrt{n}$ and
the following factoring method could be successful:


1. Choose successive numbers $x > \sqrt{n}$ and test whether $x^2 - n$ is a square.
2. In this case, we have $x^2 - n = y^2$. Thus $x^2 - y^2 = (x - y)(x + y) = n$,
   and we have found a factorization of n.


This idea for factoring numbers goes back to Fermat. 
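
A minimal sketch of this method for odd n follows. The step limit `max_steps` and the function name are our additions; the search starts at the smallest integer x with $x^2 \ge n$.

```python
from math import isqrt

def fermat_factor(n, max_steps=1_000_000):
    """Search x >= sqrt(n) with x^2 - n a perfect square; n must be odd."""
    x = isqrt(n)
    if x * x < n:
        x += 1
    for _ in range(max_steps):
        y_squared = x * x - n
        y = isqrt(y_squared)
        if y * y == y_squared:
            return x - y, x + y       # n = (x - y)(x + y)
        x += 1
    return None                       # |p - q| too large for this step limit
```

For n = 5959 the search starts at x = 78 and succeeds at x = 80, where 80² − 5959 = 441 = 21², giving the factors 59 and 101.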

To prevent other attacks on the RSA cryptosystem, the notion of strong
primes has been defined. A prime number p is called strong if the following
conditions are satisfied:


1. $p - 1$ has a large prime factor, denoted by r.
2. $p + 1$ has a large prime factor.
3. $r - 1$ has a large prime factor.


What “large” means can be derived from the attacks to be prevented (see
below). Strong primes can be generated by Gordon’s algorithm (see [Gordon84]).
If used in conjunction with a probabilistic primality test, the running time of
Gordon’s algorithm is only about 20% more than the time needed to generate
a prime factor of the RSA modulus in the way described above. Gordon’s
algorithm yields a prime with high probability. The size of the resulting prime
p can be controlled to guarantee a large absolute value $|p - q|$.

Strong primes are intended to prevent the $p - 1$ and the $p + 1$ factoring
attacks. These are efficient if $p - 1$ or $p + 1$ have only small prime
factors (see, e.g., [Forster96]; [Riesel94]). Note that $p - 1$ and $p + 1$ can
be expected to have a large prime factor if the prime p is chosen large and at
random. Moreover, choosing strong primes does not increase the protection
against factoring attacks with a modern algorithm like the number-field sieve
(see [Cohen95]). Thus, the notion of strong primes has lost significance.
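
To see why small prime factors of p − 1 are dangerous, consider the following sketch of the p − 1 method in its simplest form (Pollard's method with base 2; the smoothness bound `bound` and the toy modulus are assumptions for illustration). If every prime power dividing p − 1 divides j! for some small j, then $2^{j!} \equiv 1 \bmod p$, and a gcd computation reveals p.

```python
from math import gcd

def p_minus_1_factor(n, bound):
    """Try to split n using a = 2^(j!) mod n for j = 2, ..., bound."""
    a = 2
    for j in range(2, bound + 1):
        a = pow(a, j, n)              # now a = 2^(j!) mod n
        g = gcd(a - 1, n)
        if 1 < g < n:
            return g
    return None
```

For n = 299 = 13 · 23 the factor 13 is found at j = 4, because 13 − 1 = 12 divides 4! = 24, whereas 23 − 1 = 22 has the prime factor 11.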

There is another attack which should be prevented by strong primes:
decryption by iterated encryption. The idea is to repeatedly apply the
encryption algorithm to the cryptogram until $c = c^{e^i}$. Then
$c = (c^{e^{i-1}})^e$, and the plaintext $m = c^{e^{i-1}}$ can be recovered.
Conditions 1 and 3 ensure that this attack fails, since the order of c in
$\mathbb{Z}_n^*$ and the order of e in $\mathbb{Z}_{\varphi(n)}^*$ are, with
high probability, very large (see Exercises 6 and 7). If p and q are chosen at
random and are sufficiently large, then the probability of success of a
decryption-by-iterated-encryption attack is negligible (see [MenOorVan96], p.
313). Thus, there is no compelling reason for choosing strong primes to
prevent this attack, either.
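
On a toy key the iteration terminates after a handful of steps, as the following sketch shows. The key (n, e) = (143, 7), the iteration limit and the function name are assumptions made for illustration.

```python
def iterated_encryption_attack(c, e, n, max_iter=10_000):
    """Encrypt c repeatedly; the value just before c reappears is the plaintext."""
    previous = c
    for _ in range(max_iter):
        current = pow(previous, e, n)
        if current == c:
            return previous           # previous^e = c, so previous = m
        previous = current
    return None

n, e = 143, 7
m = 42
c = pow(m, e, n)
recovered = iterated_encryption_attack(c, e, n)
```

The attack needs no secret information, but the number of iterations is governed by the order of e modulo $\varphi(n)$, which is tiny here and astronomically large for properly chosen parameters.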

Speeding Up Encryption and Decryption. The modular exponentiation
algorithm is especially efficient if the exponent has many zeros in its
binary encoding. For each zero we have one less multiplication. We can take
advantage of this fact by choosing an encryption exponent e with many zeros
in its binary encoding. The primes 3, 17 or $2^{16} + 1$ are good examples,
with only two ones in their binary encoding.

The efficiency of decryption can be improved by use of the Chinese
Remainder Theorem (Theorem A.29). The receiver of the message knows the
factors p and q of the modulus n. Let $\phi$ be the isomorphism


$$\phi : \mathbb{Z}_n \longrightarrow \mathbb{Z}_p \times \mathbb{Z}_q, \quad [x] \longmapsto ([x \bmod p], [x \bmod q]).$$


Compute $c^d = \phi^{-1}(\phi(c^d)) = \phi^{-1}((c \bmod p)^d, (c \bmod q)^d)$.
The computation of $(c \bmod p)^d$ and $(c \bmod q)^d$ is executed in
$\mathbb{Z}_p$ and $\mathbb{Z}_q$, respectively. In $\mathbb{Z}_p$ and
$\mathbb{Z}_q$ we have much smaller numbers than in $\mathbb{Z}_n$. Moreover,
the decryption exponent d can be replaced by $d \bmod (p-1)$ and
$d \bmod (q-1)$, respectively, since
$(c \bmod p)^d \bmod p = (c \bmod p)^{d \bmod (p-1)} \bmod p$ and
$(c \bmod q)^d \bmod q = (c \bmod q)^{d \bmod (q-1)} \bmod q$
(by Proposition A.24, also see “Computing modulo a prime” on page 303).
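
A sketch of decryption with the Chinese Remainder Theorem follows. The inverse of the isomorphism is evaluated with Garner's recombination formula, which is our choice and is not prescribed by the text; the toy key is again an assumption, and `pow(q, -1, p)` requires Python 3.8 or later.

```python
def crt_decrypt(c, d, p, q):
    """Compute c^d mod p*q from two small exponentiations modulo p and q."""
    dp, dq = d % (p - 1), d % (q - 1)      # reduced decryption exponents
    mp = pow(c % p, dp, p)                 # c^d mod p
    mq = pow(c % q, dq, q)                 # c^d mod q
    q_inv = pow(q, -1, p)
    h = (q_inv * (mp - mq)) % p
    return mq + h * q                      # unique value with both residues

p, q, e, d = 11, 13, 7, 103
c = pow(42, e, p * q)
plaintext = crt_decrypt(c, d, p, q)
```

In practice the reduced exponents and the inverse of q modulo p would be precomputed once and stored with the secret key.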


3.3.2 Digital Signatures 


The RSA cryptosystem may also be used for digital signatures. Let (n, e) be
the public key and d be the secret decryption exponent of Alice. We first
discuss signing messages that are encoded by numbers
$m \in \{0, \ldots, n-1\}$. As usual, we consider those numbers as the
elements of $\mathbb{Z}_n$, and the computations are done in $\mathbb{Z}_n$.


Signing. If Alice wants to sign a message m, she uses her secret key and
computes her signature $\sigma = m^d$ of m by applying her decryption
algorithm. We call $(m, \sigma)$ a signed message.


Verification. Assume that Bob received a signed message $(m, \sigma)$ from
Alice. To verify the signature, Bob uses the public key of Alice and computes
$\sigma^e$. He accepts the signature if $\sigma^e = m$.


If Alice signed the message, we have $(m^d)^e = (E \circ D)(m) = m$ and Bob
accepts (see Proposition 3.3). However, the converse is not true. It might
happen that Bob accepts a signature not produced by Alice. Suppose Eve
uses Alice’s public key, computes $m^e$ and says that $(m^e, m)$ is a message
signed by Alice. Everyone verifying Alice’s signature gets $m^e = m^e$ and is
convinced that Alice really signed the message $m^e$. The message $m^e$ is not
likely to be meaningful if the message belongs to some natural language.
This example shows that RSA signatures can be existentially forged. This
means that an adversary can forge a signature for some message, but not for
a message of his choice.

Another attack uses the fact that the RSA encryption and decryption
functions are ring homomorphisms (see Appendix A.3). The image of a product
is the product of the images and the image of the unit element is the unit
element. If Alice signed $m_1$ and $m_2$, then the signatures for $m_1 m_2$ and
$m_1^{-1}$ are $\sigma_1 \sigma_2$ and $\sigma_1^{-1}$. These signatures can
easily be computed without the secret key. We will now discuss how to overcome
these difficulties.
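
Both forgeries are easy to reproduce with a toy key. In the sketch below, the key (n, e, d) = (143, 7, 103) and the helper names are assumptions for illustration; the forger only calls `verify` and uses signatures Alice has already issued.

```python
n, e, d = 143, 7, 103                 # toy key; d is known to Alice only

def sign(m):
    """Alice's basic RSA signature (no redundancy, no hash function)."""
    return pow(m, d, n)

def verify(m, sigma):
    return pow(sigma, e, n) == m

# Existential forgery: choose the "signature" first, derive the message.
sigma = 5
forged_message = pow(sigma, e, n)

# Multiplicative forgeries from two signatures issued by Alice.
m1, m2 = 4, 9
s1, s2 = sign(m1), sign(m2)
product_signature = (s1 * s2) % n     # valid for m1 * m2
inverse_signature = pow(s1, -1, n)    # valid for the inverse of m1
```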


Signing with Redundancy and Hash Functions. If the messages to be
signed belong to some natural language, it is very unlikely that the above
attacks will succeed. The messages $m^e$ and $m_1 m_2$ will rarely be
meaningful. When embedding the messages into $\{0,1\}^*$, the message space is
sparse. The probability that a randomly chosen bit string belongs to the
message space is small. By adding redundancy to each message we can always
guarantee that the message space is sparse, even if arbitrary bit strings are
admissible messages. A possible redundancy function is


$$R : \{0,1\}^* \longrightarrow \{0,1\}^*, \quad x \longmapsto x \| x.$$


This principle is also used in error-detection and error-correction codes.
By doubling the message, we can detect transmission errors if the first half of
the transmitted message does not match the second half. The redundancy
function R has the additional advantage that the composition of R with the RSA
function no longer preserves products.

If the message does not need to be recovered from the signature, another 
approach to prevent the attacks is the use of a hash function (see Section 
3.4). 


3.3.3 Attacks Against RSA 


We now describe attacks not primarily directed against the RSA algorithm 
itself, but against the environment in which the RSA cryptosystem is used. 


The Common-Modulus Attack. Suppose two users Bob and Bridget of
the RSA cryptosystem have the same modulus n. Let $(n, e_1)$ be the public key
of Bob and $(n, e_2)$ be the public key of Bridget, and assume that $e_1$ and
$e_2$ are relatively prime. Let $m \in \mathbb{Z}_n$ be a message sent to both
Bob and Bridget and encrypted as $c_i = m^{e_i}$, i = 1, 2. The problem now is
that the plaintext can be computed from $c_1, c_2, e_1, e_2$ and n. Since $e_1$
is prime to $e_2$, integers r and s with $r e_1 + s e_2 = 1$ can be derived by
use of the extended Euclidean algorithm (see Algorithm A.5). Either r or s,
say r, is negative. If $c_1 \notin \mathbb{Z}_n^*$, we can factor n by
computing $\gcd(c_1, n)$, thereby breaking the cryptosystem. Otherwise we
again apply the extended Euclidean algorithm and compute $c_1^{-1}$. We can
recover the message m using
$(c_1^{-1})^{-r} c_2^s = (m^{e_1})^r (m^{e_2})^s = m^{r e_1 + s e_2} = m$.
Thus, the cryptosystem fails to protect a message m if it is sent to two users
with common modulus whose encryption exponents are relatively prime.
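
The attack can be sketched as follows; `egcd` and the toy parameters are assumptions for illustration. Python's three-argument `pow` accepts a negative exponent (version 3.8 or later) and then inverts the base modulo n first, which corresponds to the computation of the inverse of $c_1$ in the text. It raises `ValueError` if the base is not a unit.

```python
def egcd(a, b):
    """Return (g, u, v) with u*a + v*b == g == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, u, v = egcd(b, a % b)
    return g, v, u - (a // b) * v

def common_modulus_attack(c1, c2, e1, e2, n):
    """Recover m from c1 = m^e1 and c2 = m^e2 modulo the same n."""
    g, r, s = egcd(e1, e2)
    if g != 1:
        raise ValueError("exponents are not relatively prime")
    return (pow(c1, r, n) * pow(c2, s, n)) % n

n, e1, e2 = 143, 7, 11
m = 42
recovered = common_modulus_attack(pow(m, e1, n), pow(m, e2, n), e1, e2, n)
```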
With common moduli, secret keys can be recovered. If Bob and Bridget
have the same modulus n, then Bob can determine Bridget’s secret key.
Namely, either Bob already knows the prime factors of n or he can compute
them from his encryption and decryption exponents, with a very high
probability (see Exercise 4). Therefore, common moduli should be avoided in RSA
cryptosystems: each user should have his own modulus. If the prime factors
(and hence the modulus) are randomly chosen, as described above, then the
probability that two users share the same modulus is negligibly small.


Low-Encryption-Exponent Attack. Suppose that the RSA cryptosystem
will be used for k users, and each user has a small encryption exponent. We
discuss the case of three users, Bob, Bridget and Bert, with public keys
$(n_i, 3)$, i = 1, 2, 3. Of course, the moduli $n_i$ and $n_j$ must satisfy
$\gcd(n_i, n_j) = 1$, for $i \ne j$, since otherwise factoring of $n_i$ and
$n_j$ is possible by computing $\gcd(n_i, n_j)$. We assume that Alice sends the
same message m to Bob, Bridget and Bert. The following attack is possible: let
$c_1 := m^3 \bmod n_1$, $c_2 := m^3 \bmod n_2$ and $c_3 := m^3 \bmod n_3$. The
inverse of the Chinese Remainder isomorphism (see Theorem A.29)


$$\phi : \mathbb{Z}_{n_1 n_2 n_3} \longrightarrow \mathbb{Z}_{n_1} \times \mathbb{Z}_{n_2} \times \mathbb{Z}_{n_3}$$


can be used to compute $m^3 \bmod n_1 n_2 n_3$. Since $m^3 < n_1 n_2 n_3$, we
can get m by computing the ordinary cube root of $m^3$ in $\mathbb{Z}$.
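
The attack can be tried out with three toy moduli. In the sketch below the moduli 187, 667 and 1927 (products of primes congruent to 2 modulo 3, so that 3 is a valid encryption exponent) and all function names are assumptions; the cube root is taken by bisection over the integers.

```python
def crt(residues, moduli):
    """Return x in [0, product of moduli) with x = r_i mod n_i for all i."""
    x, modulus = 0, 1
    for r_i, n_i in zip(residues, moduli):
        # add a multiple of `modulus` so that x also matches r_i modulo n_i
        t = ((r_i - x) * pow(modulus, -1, n_i)) % n_i
        x, modulus = x + modulus * t, modulus * n_i
    return x

def integer_cube_root(y):
    """Largest integer x with x^3 <= y, found by bisection."""
    low, high = 0, 1
    while high ** 3 <= y:
        high *= 2
    while high - low > 1:                 # invariant: low^3 <= y < high^3
        mid = (low + high) // 2
        if mid ** 3 <= y:
            low = mid
        else:
            high = mid
    return low

moduli = [11 * 17, 23 * 29, 41 * 47]      # pairwise relatively prime
m = 100
ciphertexts = [pow(m, 3, n_i) for n_i in moduli]
m_cubed = crt(ciphertexts, moduli)        # equals m^3 as an integer
recovered = integer_cube_root(m_cubed)
```

Note that the bisection works here precisely because the cube root is taken in the integers, where the ordering is available.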


Small-Message-Space Attack. If the number of all possible messages is 
small, and if these messages are known in advance, an adversary can encrypt 
all messages with the public key. He can decrypt an intercepted cryptogram 
by comparing it with the precomputed cryptograms. 


A Chosen-Ciphertext Attack. In the first phase of a chosen-ciphertext 
attack against an encryption scheme (see Section 1.3), adversary Eve has 
access to the decryption device. She obtains the plaintexts for ciphertexts of 
her choosing. Then, in the second phase, she attempts to decrypt another 
ciphertext for which she did not request decryption in the first phase. 

Basic RSA encryption does not resist the following chosen-ciphertext at- 
tack. Let (n,e) be Bob’s public RSA key, and let c be any ciphertext, en- 
crypted with Bob’s public key. 

To find the plaintext m of c, adversary Eve first chooses a random unit
$r \in \mathbb{Z}_n^*$ and requests the decryption of the random-looking message


$$r^e c \bmod n.$$


She obtains the plaintext $\tilde{m} = rm \bmod n$. Then, in the second phase
of the attack, Eve easily derives the plaintext m, because we have in
$\mathbb{Z}_n$


$$r^{-1} \tilde{m} = r^{-1} r m = m.$$


The attack relies on the fact that the RSA function is a ring isomorphism.

Analogously, there is a chosen-plaintext attack against digital signatures⁶
generated with basic RSA (i.e., RSA without a hash function). Eve can
successfully forge Bob’s signature for a message m, if she is first supplied
with a valid signature for $r^e m \bmod n$, where r is randomly chosen by Eve.

A setting in which the chosen-ciphertext attack against RSA encryption 
may work is described in the following attack. 


Attack on Encryption and Signing with RSA. The attack is possible if
Bob has only one public-secret RSA key pair and uses this key pair for both
encryption and digital signatures. Assume that the cryptosystem is also used
for mutual authentication. On request, Bob proves his identity to Eve by
signing a random number, supplied by Eve, with his secret key. Eve verifies
the signature of Bob with his public key. In this situation, Eve can successfully
attack Bob as follows. Suppose Eve intercepts a ciphertext c intended for Bob:


1. Eve selects $r \in \mathbb{Z}_n^*$ at random.

2. Eve computes $x = r^e c \bmod n$, where (n, e) is the public key of Bob. She
   sends x to Bob to get a signature $x^d$ (d the secret key of Bob). Note that
   x looks like a random number to Bob.

3. Eve computes $r^{-1} x^d$, which is the plaintext of c.
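
The three steps above can be sketched as follows, with Bob's signing service modelled by a function. The toy key, the fixed value r = 5 (instead of a random unit) and the function names are assumptions made so that the example is reproducible.

```python
n, e, d = 143, 7, 103                    # Bob's toy key; d is secret

def bob_signs(x):
    """Bob signs whatever number he is asked to sign: x -> x^d mod n."""
    return pow(x, d, n)

def recover_plaintext(c, r):
    """Eve's attack on an intercepted ciphertext c with blinding factor r."""
    x = (pow(r, e, n) * c) % n           # step 2: looks random to Bob
    signature = bob_signs(x)             # x^d = r * m mod n
    return (pow(r, -1, n) * signature) % n   # step 3: remove the factor r

c = pow(42, e, n)                        # ciphertext intended for Bob
recovered = recover_plaintext(c, r=5)
```

Replacing `bob_signs` by a decryption device gives the chosen-ciphertext attack on basic RSA encryption described before.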


Bleichenbacher’s 1-Million-Chosen-Ciphertext Attack. The
1-Million-Chosen-Ciphertext Attack of Bleichenbacher ([Bleichenbacher98]) is
an attack against PKCS#1 v1.5 ([RFC 2313]).⁷ The widely used PKCS#1 is part
of the Public-Key Cryptography Standards series PKCS. These are de facto
standards that are developed and published by RSA Security ([RSALabs]) in
conjunction with system developers worldwide. The RSA standard PKCS#1
defines mechanisms for encrypting and signing data using the RSA public
key system. We explain Bleichenbacher’s attack against encryption.

Let (n, e) be a public RSA key with encryption exponent e and modulus
n. The modulus n is assumed to be a k-byte integer, i.e.,
$256^{k-1} \le n < 256^k$. PKCS#1 defines a padding format. Messages (which are
assumed to be shorter than k bytes) are padded out to obtain a formatted
plaintext block m consisting of k bytes, and this plaintext block is then
encrypted by using the RSA function. The ciphertext is $m^e \bmod n$, as usual.
The first byte of the plaintext block m is 00, and the second byte is 02 (in
hexadecimal notation). Then a padding string follows. It consists of at least
8 randomly chosen bytes, different from 00. The end of the padding block is
marked by the zero byte 00. Then the original message bytes are appended.
After the padding, we get


$$m = 00 \,\|\, 02 \,\|\, \text{padding string} \,\|\, 00 \,\|\, \text{original message}.$$


The leading 00-byte ensures that the plaintext block, when converted to an
integer, is less than the modulus.


⁶ A detailed discussion of types of attacks against digital signature schemes is
given in Section 10.1.
⁷ PKCS#1 has been updated. The current version is Version 2.1 ([RFC 3447]).

We call a k-byte message m PKCS conforming, if it has the above format.
A message $m \in \mathbb{Z}$ is PKCS conforming, if and only if


$$2B \le m \le 3B - 1,$$


with $B = 256^{k-2}$.
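
The padding format and the interval condition can be sketched as follows. The function names are ours, and the code is an illustration of the format described above rather than a replacement for a vetted PKCS#1 implementation.

```python
import secrets

def pkcs1_v15_pad(message: bytes, k: int) -> bytes:
    """Build the k-byte block 00 || 02 || padding string || 00 || message."""
    pad_len = k - 3 - len(message)
    if pad_len < 8:
        raise ValueError("message too long for a k-byte block")
    # random padding bytes, each in the range 01..FF (never 00)
    padding = bytes(secrets.randbelow(255) + 1 for _ in range(pad_len))
    return b"\x00\x02" + padding + b"\x00" + message

def is_pkcs_conforming(m: int, k: int) -> bool:
    """Interval test 2B <= m <= 3B - 1 with B = 256^(k-2)."""
    B = 256 ** (k - 2)
    return 2 * B <= m <= 3 * B - 1

block = pkcs1_v15_pad(b"hi", 16)
m = int.from_bytes(block, "big")
```

The interval test only inspects the two leading bytes, which is the property of conforming blocks that the attack below uses.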

Adversary Eve wants to decrypt a ciphertext c. The attack is an
adaptively-chosen-ciphertext attack (see Section 1.3). Eve chooses ciphertexts
$c_1, c_2, \ldots$, different from c, and gets information about the plaintexts
from a “decryption oracle” (imagine that Eve can supply ciphertexts to the
decryption device and obtain some information on the decryption results).
Adaptively means that Eve can choose a ciphertext, get information about the
corresponding plaintext and do some analysis. Depending on the results of her
analysis, she can choose a new ciphertext, and so on. With the help of the
oracle, she computes the plaintext m of the ciphertext c. If Eve does not get
the full plaintexts of the ciphertexts $c_1, c_2, \ldots$, as in
Bleichenbacher’s attack, such an attack is also called, more precisely, a
partial chosen-ciphertext attack.

In Bleichenbacher’s attack, on input of a ciphertext, the decryption oracle 
answers whether the corresponding plaintext is PKCS conforming or not. 
There were implementations of the SSL/TLS-protocol that contained such 
an oracle in practice (see below). 

Eve wants to decrypt a ciphertext c which is the encryption of a PKCS
conforming message m. She successively constructs intervals
$[a_i, b_i] \subset \mathbb{Z}$, $i = 0, 1, 2, \ldots$, which all contain m and
become shorter in each step, usually by a factor of 2. Eve finds m as soon as
the interval has become sufficiently small and contains only one integer.

Eve knows that m is PKCS conforming and hence in $[2B, 3B - 1]$. Thus,
she starts with the interval $[a_0, b_0] = [2B, 3B - 1]$. In each step, she
chooses integers s, computes the ciphertext


$$\tilde{c} := s^e c \bmod n$$

of $sm \bmod n$ and queries the oracle with input $\tilde{c}$. The oracle
outputs whether $sm \bmod n$ is PKCS conforming or not. Whenever $sm \bmod n$
is PKCS conforming, Eve can narrow the interval [a, b]. The choice of the
multipliers s depends on the output of the previous computations. So, the
ciphertexts $\tilde{c}$ are chosen adaptively.

We now describe the attack in more detail.
Let $[a_0, b_0] = [2B, 3B - 1]$. We have $m \in [a_0, b_0]$, and the length of
$[a_0, b_0]$ is $B - 1$.
Step 1: Eve searches for the smallest integer $s_1 > 1$, such that
$s_1 m \bmod n$ is PKCS conforming. Since $2m \ge 4B$, the residue
$s_1 m \bmod n$ can be PKCS conforming only if $s_1 m \ge n + 2B$. Therefore,
Eve can start her search with $s_1 \ge \lceil (n + 2B)/(3B - 1) \rceil$.

We have $s_1 m \in [s_1 a_0, s_1 b_0]$. If $s_1 m \bmod n$ is PKCS conforming,
then

$$a_0 + tn \le s_1 m \le b_0 + tn$$

for some $t \in \mathbb{N}$ with $s_1 a_0 \le b_0 + tn$ and
$a_0 + tn \le s_1 b_0$. This means that m is contained in one of the intervals


$$[a_{1,t}, b_{1,t}] := [a_0, b_0] \cap \left[ \frac{a_0 + tn}{s_1}, \frac{b_0 + tn}{s_1} \right],$$


with $\lceil (s_1 a_0 - b_0)/n \rceil \le t \le \lfloor (s_1 b_0 - a_0)/n \rfloor$.
We call the intervals $[a_{1,t}, b_{1,t}]$ the candidate intervals of step 1.
They are pairwise disjoint and have length $< B/s_1$.

Step 2: Eve searches for the smallest integer $s_2$, $s_2 > s_1$, such that
$s_2 m \bmod n$ is PKCS conforming. With high probability (see
[Bleichenbacher98]), only one of the candidate intervals
$[a_{1,t}, b_{1,t}]$ of step 1 contains a message x, such that
$s_2 x \bmod n$ is PKCS conforming. Eve can easily find out whether an
interval [a, b] contains a message x, such that $s_2 x \bmod n$ is PKCS
conforming. By comparing the interval boundaries, she simply checks whether


$$[s_2 a, s_2 b] \cap [a_0 + rn, b_0 + rn] \ne \emptyset \quad \text{for some } r \text{ with } \lceil s_2 a / n \rceil \le r \le \lfloor s_2 b / n \rfloor.$$


By performing this check for all of the candidate intervals
$[a_{1,t}, b_{1,t}]$ of step 1, Eve finds the candidate interval containing m.
We denote this interval by $[a_1, b_1]$. With a high probability, we have
$s_2 (b_1 - a_1) < n - B$ and then, $[s_2 a_1, s_2 b_1]$ is sufficiently short
to meet only one of the intervals $[a_0 + rn, b_0 + rn]$, $r \in \mathbb{N}$,
say for $r = r_2$. Now, Eve knows that


$$m \in [a_2, b_2] := [a_1, b_1] \cap \left[ \frac{a_0 + r_2 n}{s_2}, \frac{b_0 + r_2 n}{s_2} \right].$$


The length of this interval is $< B/s_2$.

In the rare case that more than one of the candidate intervals of step 1
contains a message x, such that $s_2 x$ is PKCS conforming, or that more
than one value for $r_2$ exists, Eve is left with more than one interval
$[a_2, b_2]$. Then, she repeats step 2, starting with the candidate
intervals $[a_2, b_2]$ in place of the candidate intervals
$[a_{1,t}, b_{1,t}]$ (and searching for $s_2' > s_2$, such that
$s_2' m \bmod n$ is PKCS conforming).
Step 3: Step 3 is repeatedly executed, until the plaintext m is determined.
Eve starts with $[a_2, b_2]$ and $(r_2, s_2)$, which she has computed in
step 2. She iteratively computes pairs $(r_i, s_i)$ and intervals
$[a_i, b_i]$ of length less than $B/s_i$, such that $m \in [a_i, b_i]$. The
numbers $r_i, s_i$ are chosen, such that


1. $s_i \approx 2 s_{i-1}$,
2. $[s_i a_{i-1}, s_i b_{i-1}] \cap [a_0 + r_i n, b_0 + r_i n] \ne \emptyset$,
3. $s_i m$ is PKCS conforming.


The number of multipliers s that Eve has to test by querying the decryption
oracle is much smaller than in steps 1 and 2. She searches for $s_i$ only in
the neighborhood of $2 s_{i-1}$. This kind of choosing $s_i$ works, because,
after step 2, the intervals $[a_i, b_i]$ are sufficiently small. The length
of $[a_{i-1}, b_{i-1}]$ is less than $B/s_{i-1}$, and
$s_i \approx 2 s_{i-1}$. Hence, the length of the interval
$[s_i a_{i-1}, s_i b_{i-1}]$ is less than $\approx 2B$, and it therefore
meets at most one of the intervals $[a_0 + rn, b_0 + rn]$,
$r \in \mathbb{N}$, say for $r = r_i$. From properties 2 and 3, we conclude
that $s_i m \in [a_0 + r_i n, b_0 + r_i n]$. Eve sets

$$[a_i, b_i] := [a_{i-1}, b_{i-1}] \cap [(a_0 + r_i n)/s_i, (b_0 + r_i n)/s_i].$$

Then, $[a_i, b_i]$ contains m and its length is less than $B/s_i$.

The upper bound $B/s_i$ for the length of $[a_i, b_i]$ decreases by a
factor of two in each iteration ($s_i \approx 2 s_{i-1}$). Step 3 is
repeated, until $[a_i, b_i]$ contains only one integer. This integer is the
searched plaintext m.

The analysis in [Bleichenbacher98] shows that for a 1024-bit modulus n,
the total number of ciphertexts, for which Eve queries the oracle, is
typically about $2^{20}$.

Chosen-ciphertext attacks were considered to be only of theoretical inter- 
est. Bleichenbacher’s attack proved the contrary. It was used against a web 
server with the ubiquitous SSL/TLS protocol ([RFC 4346]). In the interac- 
tive key establishment phase of SSL/TLS, a secret session key is encrypted 
by using RSA and PKCS#1 v1.5. For a communication server it is natural to 
process many messages, and to report the success or failure of an operation. 
Some implementations of SSL/TLS reported an error to the client, when the 
RSA-encrypted message was not PKCS conforming. Thus, they could be used 
as the oracle. The adversary could anonymously attack the server, because 
SSL/TLS is often applied without client authentication. To prevent Bleichen- 
bacher’s attack, the implementation of SSL/TLS-servers was improved and 
the PKCS#1 standard was updated. Now it uses the OAEP padding scheme, 
which we describe below in Section 3.3.4. 

Encryption schemes that are provably secure against adaptively-chosen- 
ciphertext attacks are studied in more detail in Section 9.5. 

The predicate “PKCS conforming” reveals one bit of information about 
the plaintext. Bleichenbacher’s attack shows that, if we were able to com- 
pute the bit “PKCS conforming” for RSA-ciphertexts, then we could easily 
compute complete (PKCS conforming) plaintexts from the ciphertexts. To 
compute the bit “PKCS conforming” from c is as difficult as to compute m 
from c. Such a bit is called a secure bit. The bit security of one-way functions 
is carefully studied in Chapter 7. There we show, for example, that the least 
significant bit of an RSA-encrypted message is as secure as the whole mes- 
sage. In particular, we develop an algorithm that inverts the RSA function 
given an oracle with only a small advantage on the least significant bit. 


3.3.4 Probabilistic RSA Encryption 


Before applying an encryption algorithm such as RSA, preprocessing of the 
message is necessary. The message is divided into blocks and then some
padding or formatting mechanisms are performed. Such preprocessing is
provided by the OAEP (optimal asymmetric encryption padding) scheme. It is
not only applicable with the RSA cryptosystem; it can be used with any en- 
cryption scheme based on a bijective trapdoor function f, such as the RSA 
function or the modular squaring function of Rabin’s cryptosystem (see Sec- 
tion 3.6.1). 

In addition to the trapdoor function

$$f : D \to D, \quad D \subset \{0,1\}^n,$$

a pseudorandom bit generator

$$G : \{0,1\}^k \to \{0,1\}^l$$

and a hash function

$$h : \{0,1\}^l \to \{0,1\}^k$$

are used, with $n = l + k$. Given a random seed $s \in \{0,1\}^k$ as input,
G generates a pseudorandom bit sequence of length l (see Chapter 8 for more
details on pseudorandom bit generators). Hash functions will be discussed in
Section 3.4.


Encryption. To encrypt a message $m \in \{0,1\}^l$, we proceed in three
steps:


1. We choose a random bit string $r \in \{0,1\}^k$.

2. We set $x = (m \oplus G(r)) \| (r \oplus h(m \oplus G(r)))$.
   (If $x \notin D$ we return to step 1.)

3. We compute $c = f(x)$.


As always, let $\|$ denote the concatenation of strings and $\oplus$ the
bitwise XOR operator.

OAEP is an embedding scheme. The message m is embedded into the
input x of f such that all bits of x depend on the bits of m. The length of
the message m is l. Shorter messages are padded with some additional bits
to get length l. The first l bits of x, namely $m \oplus G(r)$, are obtained
from m by masking with the pseudorandom bits G(r). The seed r is encoded in
the last k bits, masked with $h(m \oplus G(r))$. The encryption depends on a
randomly chosen r. Therefore, the resulting encryption scheme is not
deterministic: encrypting a message m twice will produce different
ciphertexts.


Decryption. To decrypt a ciphertext c, we use the function $f^{-1}$, the
same pseudorandom bit generator G and the same hash function h as above:


1. Compute $f^{-1}(c) = a \| b$, with $|a| = l$ and $|b| = k$.
2. Set $r = h(a) \oplus b$ and get $m = a \oplus G(r)$.
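
The embedding and its inversion can be sketched in a few lines. This is
only an illustration of the basic scheme under assumptions of our own: the
trapdoor function f is left out, G and h are instantiated with SHAKE-256
(with prefixes "G" and "h" to separate them), and the sizes l = 256 bits
and k = 128 bits as well as all function names are chosen freely.

```python
import hashlib
import os

L_BYTES = 32   # l = 256 bits (assumed message length)
K_BYTES = 16   # k = 128 bits (assumed seed length)


def G(seed: bytes) -> bytes:
    """Stand-in generator {0,1}^k -> {0,1}^l, built from SHAKE-256."""
    return hashlib.shake_256(b"G" + seed).digest(L_BYTES)


def h(data: bytes) -> bytes:
    """Stand-in hash function {0,1}^l -> {0,1}^k, built from SHAKE-256."""
    return hashlib.shake_256(b"h" + data).digest(K_BYTES)


def xor(u: bytes, v: bytes) -> bytes:
    return bytes(p ^ q for p, q in zip(u, v))


def oaep_encode(m: bytes) -> bytes:
    """Return x = (m xor G(r)) || (r xor h(m xor G(r))) for a fresh r."""
    assert len(m) == L_BYTES
    r = os.urandom(K_BYTES)
    a = xor(m, G(r))
    b = xor(r, h(a))
    return a + b


def oaep_decode(x: bytes) -> bytes:
    """Split x = a || b, recover r = h(a) xor b and m = a xor G(r)."""
    a, b = x[:L_BYTES], x[L_BYTES:]
    r = xor(h(a), b)
    return xor(a, G(r))
```

In a complete scheme, `oaep_encode(m)` would be passed to f and
`oaep_decode` would be applied to $f^{-1}(c)$. Two encodings of the same
message differ, because a fresh r is drawn each time.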


To compute the plaintext m from the ciphertext c = f(x), an adversary
must figure out all the bits of x from c = f(x). He needs the first l bits
to compute h(a) and the last k bits to get r. Therefore, an adversary cannot
exploit any advantage from some partial knowledge of x.

The OAEP scheme is published in [BelRog94]. OAEP has great practical 
importance. It has been adopted in PKCS#1 v2.0, a widely used standard 
which is implemented by Internet browsers and used in the secure socket layer 
protocol (SSL/TLS, [RFC 4346]). Using OAEP prevents Bleichenbacher’s
attack, which we studied in the preceding section. Furthermore, OAEP is
included in electronic payment protocols to encrypt credit card numbers, and
it is part of the IEEE P1363 standard.

For practical purposes, it is recommended to implement the hash function 
h and the random bit generator G using the secure hash algorithm SHA-1 
(see Section 3.4.2 below) or some other cryptographic hash algorithm which 
is considered secure (for details, see [BelRog94]). 

If h and G are implemented with efficient hash algorithms, the time to 
compute h and G is negligible compared to the time to compute f and f~!. 
Formatting with OAEP does not increase the length of the message substan- 
tially. 

Using OAEP with RSA encryption no longer preserves the multiplicative 
structure of numbers, and it is probabilistic. This prevents the previously 
discussed small-message-space attack. The low encryption exponent attack 
against RSA is also prevented, provided that the plaintext is individually 
re-encoded by OAEP for each recipient before it is encrypted with RSA. 

We explained the so-called basic OAEP scheme. A slight modification of
the basic scheme is the following. Let k, l be as before and let k' be
another parameter, with $n = l + k + k'$. We use a pseudorandom generator

$$G : \{0,1\}^k \to \{0,1\}^{l+k'}$$

and a cryptographic hash function

$$h : \{0,1\}^{l+k'} \to \{0,1\}^k.$$

To encrypt a message $m \in \{0,1\}^l$, we first append k' 0-bits to m, and
then we encrypt the extended message as before, i.e., we randomly choose a
bit string $r \in \{0,1\}^k$ and the encryption c of m is defined by

$$c = f\big(((m \| 0^{k'}) \oplus G(r)) \,\|\, (r \oplus h((m \| 0^{k'}) \oplus G(r)))\big).$$

Here, we denote by $0^{k'}$ the constant bit string 000...0 of length k'.

In [BelRog94], the modified scheme is proven to be secure in the random 
oracle model against adaptively-chosen-ciphertext attacks. The proof assumes 
that the hash function and the pseudorandom generator used behave like 
truly random functions. We describe the random oracle model in Section 
3.4.5. 

In [Shoup2001], it was observed that there is a gap in the security proof of 
OAEP. This does not imply that a particular instantiation of OAEP, such as
OAEP with RSA, is insecure. In the same paper, it is shown that OAEP with
RSA is secure for an encryption exponent of 3. In [FujOkaPoiSte2001], this 
result is generalized to arbitrary encryption exponents. In Section 9.5.1, we 
describe SAEP — a simplified OAEP — and we give a security proof for SAEP 
in the random oracle model against adaptively-chosen-ciphertext attacks. 


3.4 Cryptographic Hash Functions 


Cryptographic hash functions such as SHA-1 or MD5 are widely used in 
cryptography. In digital signature schemes, messages are first hashed and 
the hash value h(m) is signed in place of m. Hash values are used to check 
the integrity of public keys. Pseudorandom bit strings are generated by hash 
functions. When used with a secret key, cryptographic hash functions become 
message authentication codes (MACs), the preferred tool in protocols like SSL 
and IPSec to check the integrity of a message and to authenticate the sender. 

A hash function is a function that takes as input an arbitrarily long string 
of bits (called a message) and outputs a bit string of a fixed length n. Math- 
ematically, a hash function is a function 


$$h : \{0,1\}^* \to \{0,1\}^n, \quad m \mapsto h(m).$$


The length n of the output is typically between 128 and 512 bits⁸. Later,
when discussing the birthday attack, we will see why the output lengths are 
in this range. 
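
The fixed output length can be observed directly with Python's standard
`hashlib` module. The helper names `output_bits` and `digest_hex` are our
own; the algorithms queried are just examples.

```python
import hashlib


def output_bits(name: str) -> int:
    """Output length n (in bits) of the named hash function."""
    return 8 * hashlib.new(name).digest_size


def digest_hex(name: str, message: bytes) -> str:
    """Hash value of a message, as a hexadecimal string."""
    return hashlib.new(name, message).hexdigest()


lengths = {name: output_bits(name) for name in ("md5", "sha1", "sha256", "sha512")}

# The input may have any length; the output length does not change.
short = digest_hex("sha256", b"")
long_ = digest_hex("sha256", b"x" * 100_000)
```

The dictionary `lengths` reports 128, 160, 256 and 512 bits, and the two
SHA-256 digests have the same length although the inputs differ greatly in
size.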

One basic requirement is that the hash values h(m) are easy to compute, 
making both hardware and software implementations practical. 


3.4.1 Security Requirements for Hash Functions 


A classical application of cryptographic hash functions is the “encryption” 
of passwords. Rather than storing the cleartext of a user password pwd in 
the password file of a system, the hash value h(pwd) is stored in place of 
the password itself. If a user enters a password, the system computes the 
hash value of the entered password and compares it with the stored value. 
This technique of non-reversible “encryption” is applied in operating systems. 
It prevents, for example, passwords becoming known to privileged users of 
the system such as administrators, provided it is not possible to compute a 
password pwd from its hash value h(pwd). This leads to our first security 
requirement. 
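
The password check described above can be sketched as follows. The use of
SHA-256 and the function names are assumptions made for illustration.
Systems in practice usually add a per-user salt and a deliberately slow
hash function, which goes beyond what is discussed here.

```python
import hashlib
import hmac


def hash_password(pwd: str) -> bytes:
    """Compute h(pwd); only this value is kept in the password file."""
    return hashlib.sha256(pwd.encode("utf-8")).digest()


def check_password(entered: str, stored: bytes) -> bool:
    """Hash the entered password and compare it with the stored value."""
    return hmac.compare_digest(hash_password(entered), stored)


stored = hash_password("correct horse")     # written once, at registration
```

A later call `check_password("correct horse", stored)` succeeds, while any
other entered password is rejected, and the cleartext password never has
to be stored.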

A cryptographic hash function must be a one-way function: Given a value
$y \in \{0,1\}^n$, it is computationally infeasible to find an m with
h(m) = y.

If a hash function is used in conjunction with digital signature schemes,
the message is hashed first and then the hash value is signed in place of
the original message. Suppose Alice signs h(m) for a message m. Adversary
Eve should have no chance to find a message $m' \ne m$ with h(m') = h(m).
Otherwise, she could pretend that Alice signed m' instead of m.

8 The output lengths of MD5 and SHA-1 are 128 and 160 bits.

Thus, the hash function must have the property that given a message m,
it is computationally infeasible to obtain a second message m' with
$m \ne m'$ and h(m) = h(m'). This property is called the second pre-image
resistance.

When using hash functions with digital signatures, we require an even 
stronger property. The legal user Alice of a signature scheme with hash func- 
tion h should have no chance of finding two distinct messages m and m’ with 
h(m) = h(m’). If Alice finds such messages, she could sign m and say later 
that she has signed m’ and not m. 

Such a pair (m, m') of messages, with $m \ne m'$ and h(m) = h(m'), is
called a collision of h. If it is computationally infeasible to find a
collision (m, m') of h, then h is called collision resistant.

Sometimes, collision resistant hash functions are called collision free, but 
that’s misleading. The function h maps an infinite number of elements to a 
finite number of elements. Thus, there are lots of collisions (in fact, infinitely 
many). Collision resistance merely states that they cannot be found. 


Proposition 3.5. A collision-resistant hash function h is second-pre-image 
resistant. 


Proof. An algorithm computing second pre-images can be used to compute
collisions in the following way: Choose m at random. Compute a second
pre-image $m' \ne m$ of h(m). Then (m, m') is a collision of h.


Proposition 3.5 says that collision resistance is the stronger property. 
Therefore, second pre-image resistance is sometimes also called weak collision 
resistance, and collision resistance is referred to as strong collision resistance. 


Proposition 3.6. A second-pre-image-resistant hash function is a one-way 
function. 


Proof. If h were not one-way, there would be a practical algorithm A that on
input of a randomly chosen value v computes a message $\tilde{m}$ with
$h(\tilde{m}) = v$, with a non-negligible probability. Given a random
message m, attacker Eve could find, with a non-negligible probability, a
second pre-image of h(m) in the following way: She applies A to the hash
value h(m) and obtains $\tilde{m}$ with $h(\tilde{m}) = h(m)$. The
probability that $\tilde{m} \ne m$ is high.


Our definitions and the argument in the previous proof lack some precision
and are not mathematically rigorous. For example, we do not explain what
“computationally infeasible” and a “non-negligible probability” mean. It is
possible to give precise definitions and a rigorous proof of Proposition 3.6
(see Chapter 10, Exercise 2).


Definition 3.7. A hash function is called a cryptographic hash function if it
is collision resistant.


Sometimes, hash functions used in cryptography are referred to as one- 
way hash functions. We have seen that there is a stronger requirement, col- 
lision resistance, and the one-way property follows from it. Therefore, we 
prefer to speak of collision-resistant hash functions. 


3.4.2 Construction of Hash Functions 


Merkle-Damgård’s construction. There are no known examples of hash
functions whose collision resistance can be proven without any assumptions. 
In Section 10.2, we give examples of (rather inefficient) hash functions that 
are provably collision resistant under standard assumptions in public-key 
cryptography, such as the factoring assumption. 

Many cryptographic hash functions used in practice are obtained by the
following method, known as Merkle-Damgård’s construction or Merkle’s
meta method. The method reduces the problem of constructing a
collision-resistant hash function $h : \{0,1\}^* \to \{0,1\}^n$ to the
problem of constructing a collision-resistant function

$$f : \{0,1\}^{n+r} \to \{0,1\}^n \quad (r \in \mathbb{N},\ r > 0)$$

with finite domain $\{0,1\}^{n+r}$. Such a function f is called a
compression function. A compression function maps messages m of a fixed
length n + r to messages f(m) of length n. We call r the compression rate.

We discuss Merkle-Damgård’s construction. Let
$f : \{0,1\}^{n+r} \to \{0,1\}^n$ be a compression function with
compression rate r. By using f, we define a hash function

$$h : \{0,1\}^* \to \{0,1\}^n.$$


Let m € {0,1}* be a message of arbitrary length. The hash function h works 
iteratively. To compute the hash value h(m), we start with a fixed initial
n-bit hash value $v = v_0$ (the same for all m). The message m is subdivided
into
blocks of length r. One block after the other is taken from m, concatenated 
with the current value v and compressed by f to get a new v. The final v is 
the hash value h(m). 

More precisely, we pad m out, i.e., we append some bits to m, to obtain
a message $\tilde{m}$, whose bit length is a multiple of r. We apply the
following padding method: A single 1-bit followed by as few (possibly zero)
0-bits as necessary are appended. Every message m is padded out with such a
string 100...0, even if the length of the original message m is a multiple
of r. This guarantees that the padding can be removed unambiguously: the
bits which are added during padding can be distinguished from the original
message bits.⁹

9 There are also other padding methods which may be applied. See, for
example, [RFC 3369].

After the padding, we decompose

$$\tilde{m} = m_1 \| \ldots \| m_k, \quad m_i \in \{0,1\}^r, \ 1 \le i \le k,$$

into blocks $m_i$ of length r.

We add one more r-bit block $m_{k+1}$ to $\tilde{m}$ and store the original
length of m (i.e., the length of m before padding it out) into this block
right-aligned. The remaining bits of $m_{k+1}$ are filled with zeros:

$$\tilde{m} = m_1 \| m_2 \| \ldots \| m_k \| m_{k+1}.$$

Starting with the initial value $v_0 \in \{0,1\}^n$, we set recursively

$$v_i := f(v_{i-1} \| m_i), \quad 1 \le i \le k+1.$$

The last value of v is taken as hash value h(m):

$$h(m) := v_{k+1}.$$


The last block $m_{k+1}$ is added to prevent certain types of collisions.
It might happen that we obtain $v_i = v_0$ for some i. If we had not added
$m_{k+1}$, then (m, m') would be a collision of h, where m' is obtained from
$m_{i+1} \| \ldots \| m_k$ by removing the padding string 10...0 from the
last block $m_k$. Since m and m' have different lengths, the additional
length blocks differ and prevent such collisions.¹⁰
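
The iteration can be sketched at the byte level. Everything concrete in
the following block is an assumption made for illustration: the
compression function f is simulated by truncating SHA-256, the sizes are
n = 128 and r = 256 bits, the initial value is all zeros, and the padding
string 100...0 is realized by the byte 0x80 followed by zero bytes.

```python
import hashlib

N_BYTES = 16              # n = 128 bits (assumed)
R_BYTES = 32              # r = 256 bits (assumed)
V0 = bytes(N_BYTES)       # fixed initial value v_0 (assumed to be all zeros)


def f(block: bytes) -> bytes:
    """Stand-in compression function {0,1}^(n+r) -> {0,1}^n."""
    assert len(block) == N_BYTES + R_BYTES
    return hashlib.sha256(block).digest()[:N_BYTES]


def md_blocks(m: bytes) -> list:
    """Pad m with 100...0, append the length block, split into r-bit blocks."""
    padded = m + b"\x80"                          # the single 1-bit, then 0-bits
    padded += bytes(-len(padded) % R_BYTES)       # fill up to a multiple of r
    padded += (8 * len(m)).to_bytes(R_BYTES, "big")   # bit length, right-aligned
    return [padded[i:i + R_BYTES] for i in range(0, len(padded), R_BYTES)]


def md_hash(m: bytes) -> bytes:
    """Compute v_i = f(v_(i-1) || m_i) over all blocks; return the last v."""
    v = V0
    for block in md_blocks(m):
        v = f(v + block)
    return v
```

For example, `md_blocks(b"")` consists of one padding block and the length
block, and a message whose length is already a multiple of r still
receives a full padding block.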


Proposition 3.8. Let f be a collision-resistant compression function. The 
hash function h constructed by Merkle’s meta method is also collision resis- 
tant. 


Proof. The proof runs by contradiction. Assume that h is not collision
resistant, i.e., that we can efficiently find a collision (m, m') of h. Let
$(\tilde{m}, \tilde{m}')$ be the modified messages as above. The following
algorithm efficiently computes a collision of f from
$(\tilde{m}, \tilde{m}')$. This contradicts our assumption that f is
collision resistant.

10 Sometimes, the padding and the length block are combined: the length of
the original message is stored into the rightmost bits of the padding
string. See, for example, SHA-1 ([RFC 3174]).

Algorithm 3.9.
collision FindCollision(bitString m, m')

    $\tilde{m} = m_1 \| \ldots \| m_{k+1}$, $\tilde{m}' = m'_1 \| \ldots \| m'_{k'+1}$ decomposed as above
    $v_1, \ldots, v_{k+1}$; $v'_1, \ldots, v'_{k'+1}$ constructed as above
    if $|m| \ne |m'|$
        then return $(v_k \| m_{k+1},\ v'_{k'} \| m'_{k'+1})$
    for $i \leftarrow 1$ to $k$ do
        if $v_i \ne v'_i$ and $v_{i+1} = v'_{i+1}$
            then return $(v_i \| m_{i+1},\ v'_i \| m'_{i+1})$
    for $i \leftarrow 0$ to $k - 1$ do
        if $m_{i+1} \ne m'_{i+1}$
            then return $(v_i \| m_{i+1},\ v'_i \| m'_{i+1})$


Note that $h(m) = v_{k+1} = v'_{k'+1} = h(m')$. If $|m| \ne |m'|$ we have
$m_{k+1} \ne m'_{k'+1}$, since the length of the string is encoded in the
last block. Hence $v_k \| m_{k+1} \ne v'_{k'} \| m'_{k'+1}$. We obtain a
collision $(v_k \| m_{k+1},\ v'_{k'} \| m'_{k'+1})$, because
$f(v_k \| m_{k+1}) = h(m) = h(m') = f(v'_{k'} \| m'_{k'+1})$. On the other
hand, if $|m| = |m'|$, then k = k', and we are looking for an index i with
$v_i \ne v'_i$ and $v_{i+1} = v'_{i+1}$.
$(v_i \| m_{i+1},\ v'_i \| m'_{i+1})$ is then a collision of f, because
$f(v_i \| m_{i+1}) = v_{i+1} = v'_{i+1} = f(v'_i \| m'_{i+1})$. If no index
with the above condition exists, we have $v_i = v'_i$,
$1 \le i \le k+1$. In this case, we search for an index i with
$m_{i+1} \ne m'_{i+1}$. Such an index exists, because $m \ne m'$.
$(v_i \| m_{i+1},\ v'_i \| m'_{i+1})$ is then a collision of f, because
$f(v_i \| m_{i+1}) = v_{i+1} = v'_{i+1} = f(v'_i \| m'_{i+1})$.


The Birthday Attack. One of the main questions when designing a hash
function $h : \{0,1\}^* \to \{0,1\}^n$ is how large to choose the length n
of the hash values. A lower bound for n is obtained by analyzing the
birthday attack.

The birthday attack is a brute-force attack against collision resistance.
Adversary Eve randomly generates messages $m_1, m_2, m_3, \ldots$. For each
newly generated message $m_i$, Eve computes and stores the hash value
$h(m_i)$ and compares it with the previous hash values. If $h(m_i)$
coincides with one of the previous hash values, $h(m_i) = h(m_j)$ for some
j < i, Eve has found a collision $(m_i, m_j)$¹¹. We show below that Eve can
expect to find a collision after choosing about $2^{n/2}$ messages. Thus, it
is necessary to choose n so large that it is impossible to calculate and
store $2^{n/2}$ hash values. If n = 128 (as with MD5), about
$2^{64} \approx 1.8 \cdot 10^{19}$ messages have to be chosen for a
successful attack.¹² Many people think that today a hash length of 128 bits
is no longer large enough, and that 160 bits (as in SHA-1 and RIPEMD-160)
should be the lower bound (also see Section 3.4.2 below).
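
The attack becomes feasible as soon as n is small, which a short
experiment can show. In the sketch below, the hash function is an
artificial one: SHA-256 truncated to its first `n_bits` bits. The
messages are generated deterministically from a counter. Both choices,
and the function names, are assumptions for illustration only.

```python
import hashlib


def truncated_hash(message: bytes, n_bits: int) -> int:
    """First n_bits bits of SHA-256, used as a toy hash function with small n."""
    digest = hashlib.sha256(message).digest()
    return int.from_bytes(digest, "big") >> (256 - n_bits)


def birthday_collision(n_bits: int):
    """Generate messages until two of them have the same truncated hash value.

    Returns the two colliding messages and the number of messages tried.
    """
    seen = {}                                   # hash value -> message
    counter = 0
    while True:
        message = counter.to_bytes(8, "big")    # deterministic message m_i
        value = truncated_hash(message, n_bits)
        if value in seen:
            return seen[value], message, counter + 1
        seen[value] = message
        counter += 1


m1, m2, tried = birthday_collision(24)
```

With n = 24, a collision typically appears after a few thousand messages,
which is of the order $2^{n/2} = 4096$ rather than $2^{24}$.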


11 In practice, the messages are generated by a deterministic pseudorandom
generator. Therefore, the messages themselves can be reconstructed and need
not be stored.

12 To store $2^{64}$ 16-byte hash values, you need $2^{28}$ TB of storage.
There are memoryless variations of the birthday attack which avoid these
extreme storage requirements, see [MenOorVan96].

Attacking second-pre-image resistance or the one-way property of h with
brute force would mean to generate, for a given hash value
$v \in \{0,1\}^n$, random messages $m_1, m_2, m_3, \ldots$ and check each
time whether $h(m_i) = v$. Here we expect to find a pre-image of v after
choosing $2^n$ messages (see Exercise 9). For n = 128, we need
$2^{128} \approx 3.4 \cdot 10^{38}$ messages. To protect against this
attack, a smaller n would be sufficient.

The surprising efficiency of the birthday attack is based on the birthday
paradox. It says that the probability of two persons in a group sharing the
same birthday is greater than 1/2, if the group is chosen at random and has
at least 23 members. It is really surprising that this happens with such a
small group.

Considering hash functions, the 365 days of a year correspond to the
number of hash values. We assume in our discussion that the hash function
$h : \{0,1\}^* \to \{0,1\}^n$ behaves like the birthdays of people. Each of
the $s = 2^n$ values has the same probability. This assumption is
reasonable. It is a basic design principle that a cryptographic hash
function comes close to a random function, which yields random and
uniformly distributed values (see Section 3.4.4).

Evaluating h, k times with independently chosen inputs, the probability
that no collisions occur is

$$p = p(s,k) = \frac{1}{s^k} \prod_{i=0}^{k-1} (s - i) = \prod_{i=1}^{k-1} \left(1 - \frac{i}{s}\right).$$

We have $1 - x \le e^{-x}$ for all real numbers x and get

$$p \le \prod_{i=1}^{k-1} e^{-i/s} = e^{-(1/s) \sum_{i=1}^{k-1} i} = e^{-k(k-1)/2s}.$$

The probability that a collision occurs is 1 − p, and $1 - p \ge 1/2$ if

$$k \ge \tfrac{1}{2}\left(\sqrt{1 + 8 \ln 2 \cdot s} + 1\right) \approx 1.18 \sqrt{s}.$$

For s = 365, we get an explanation for the original birthday paradox,
since $1.18 \cdot \sqrt{s} = 22.54$.

For the hash function h, we conclude that it suffices to choose about
$2^{n/2}$ many messages at random to obtain a collision with probability
greater than 1/2.
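
The formulas above are easy to evaluate numerically. The following sketch
(with function names of our own choosing) computes the exact product for p
and the threshold for k, and applies both to s = 365.

```python
import math


def no_collision_probability(s: int, k: int) -> float:
    """p(s, k): product of (1 - i/s) for i = 1, ..., k - 1."""
    p = 1.0
    for i in range(1, k):
        p *= 1 - i / s
    return p


def birthday_threshold(s: int) -> float:
    """The bound (sqrt(1 + 8 ln 2 * s) + 1) / 2 for k from the text."""
    return 0.5 * (math.sqrt(1 + 8 * math.log(2) * s) + 1)


p22 = no_collision_probability(365, 22)
p23 = no_collision_probability(365, 23)
bound23 = math.exp(-23 * 22 / (2 * 365))    # upper bound for p(365, 23)
threshold = birthday_threshold(365)
```

For 23 persons the collision probability 1 − `p23` exceeds 1/2, for 22
persons it does not, and `threshold` lies just below 23.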

In a hash-then-decrypt digital signature scheme, where the hash value is 
signed in place of the message (see Section 3.4.5 below), the birthday attack 
might be practically implemented in the following way. Suppose that Eve 
and Bob want to sign a contract $m_1$. Later, Eve wants to say that Bob has
signed a different contract $m_2$. Eve generates $O(2^{n/2})$ minor
variations of $m_1$ and $m_2$. In many cases, for example, if $m_1$ includes
a bitmap, Bob might not observe the slight modification of $m_1$. If the
birthday attack is successful, Eve gets messages $\tilde{m}_1$ and
$\tilde{m}_2$ with $h(\tilde{m}_1) = h(\tilde{m}_2)$. Eve lets Bob sign the
contract $\tilde{m}_1$. Later, she can pretend that Bob signed
$\tilde{m}_2$.


Compression Functions from Block Ciphers. We show in this section
how to derive compression functions from a block cipher, such as DES or
AES. From these compression functions, cryptographic hash functions can
be obtained by using Merkle-Damgård’s construction.

Symmetric block ciphers are widely used and well studied. Encryption 
is implemented by efficient algorithms (see Chapter 2). It seems natural to 
also use them for the construction of compression functions. Though no rig- 
orous proofs exist, the hope is that a good block cipher will result in a good 
compression function. Let 


$$E : \{0,1\}^r \times \{0,1\}^n \to \{0,1\}^n, \quad (k, x) \mapsto E(k, x)$$


be the encryption function of a symmetric block cipher, which encrypts blocks 
x of bit length n with r-bit keys k. 

First, we consider constructions where the bit length of the hash value 
is equal to the block length of the block cipher. These schemes are called 
single-length MDCs¹³. To obtain a collision-resistant compression function,
the block length n of the block cipher should be at least 128 bits. 

The compression function 


$$f_1 : \{0,1\}^{n+r} \to \{0,1\}^n, \quad (x \| y) \mapsto E(y, x)$$


maps bit blocks of length n+ r to blocks of length n. A block of length n+r 
is split into a left block x of length n and a right block y of length r. The 
right block y is used as key to encrypt the left block x. 

The second example of a compression function — it is the basis of the 
Matyas-Meyer-Oseas hash function ([MatMeyOse85]) — has been included in 
[ISO/IEC 10118-2]. Its compression rate is n. A block of bit length 2n is 
split into two halves, x and y, each of length n. Then x is encrypted with a 
key g(y) which is derived from y by a function
$g : \{0,1\}^n \to \{0,1\}^r$.¹⁴ The resulting ciphertext is bitwise XORed
with x:

$$f_2 : \{0,1\}^{2n} \to \{0,1\}^n, \quad (x \| y) \mapsto E(g(y), x) \oplus x.$$

If the block length of the block cipher is less than 128, double-length MDCs
are used. Compression functions whose output length is twice the block length
can be obtained by combining two types of the above compression functions
(for details, see [MenOorVan96]).


Real Hash Functions. Most cryptographic hash functions used in practice
today do not rely on other cryptographic primitives such as block ciphers.
They are derived from custom-designed compression functions by applying
Merkle-Damgård’s construction. The functions are especially designed for the
purpose of hashing, with performance efficiency in mind.

13 The acronym MDC is explained in Section 3.4.3 below.

14 For example, we can take g(y) = y, if the block length of E is equal to
the key length, or, more generally, we can compute r key bits g(y) from y by
using a Boolean function.

In [Rivest90], R. Rivest proposed MD4, which is algorithm number 4 in a 
family of hash algorithms. MD4 was designed for software implementation on 
a 32-bit processor. The MD4 algorithm is not strong enough, as early attacks 
showed. However, the design principles of the MD4 algorithm were subse- 
quently used in the construction of hash functions. These functions are often 
called the MD4 family. The family contains the most popular hash functions 
in use today, such as MD5, SHA-1 and RIPEMD-160. The hash values of 
MD5 are 128 bits long, those of RIPEMD-160 and SHA-1 160 bits. All of 
these hash functions are iterative hash functions; they are constructed with
Merkle-Damgård’s method. The compression rate of the underlying compression
functions is 512 bits.

SHA-1 is included in the Secure Hash Standard FIPS 180 of NIST 
([RFC 1510]; [RFC 3174]). It is an improvement of SHA-0, which turned out 
to have a weakness. The standard was updated in 2002 ([FIPS 180-2]). Now
it includes additional algorithms that produce 256-bit, 384-bit and 512-bit 
outputs. 

Since no rigorous mathematical proofs for the security of these hash func- 
tions exist, there is always the chance of a surprise attack. 

For example, the MD5 algorithm is very popular, but there have been 
very successful attacks. 

In 1996, H. Dobbertin detected collisions $(v_0 \| m, v_0 \| m')$ of the
underlying compression function, where $v_0$ is a common 128-bit string and
m, m' are distinct 512-bit messages ([Dobbertin96a]; [Dobbertin96]).
Dobbertin’s $v_0$ is different from the initial value that is specified for
MD5 in the Merkle-Damgård iteration. Otherwise, the collision would have
immediately implied a collision of MD5 (note that the same length block is
appended to both messages). Already in 1993, B. den Boer and A. Bosselaers
had detected collisions of MD5’s compression function. Their collisions
$(v_0 \| m, v_0' \| m)$ were made of distinct initial values $v_0$ and
$v_0'$ and the same message m. Thus, they did not fit Merkle-Damgård’s
method and were sometimes called pseudocollisions.

The recent attacks by the Chinese researchers X. Wang, D. Feng, X. Lai 
and H. Yu showed that MD5 can no longer be considered collision-resistant. 
In August 2004, they published collisions for the hash functions MD4, MD5, 
HAVAL-128, RIPEMD-128 ([WanFenLaiYu04]). V. Klima published an algo- 
rithm which works for any initial value and computes collisions of MD5 on a 
standard PC within a minute ([Klima06]). MD5 is really broken. 

Moreover, in February 2005, X. Wang, Y. L. Yin and H. Yu cast serious
doubts on the security of SHA-1 ([WanYinYu05]). They announced that they
found an algorithm which computes collisions with $2^{69}$ hash operations.
This is much less than the expected $2^{80}$ steps of the brute-force
birthday attack. With current technology, $2^{69}$ steps are still on the
far edge of feasibility. For example, the RC5-64 Challenge was finished in
2002. A worldwide network of Internet users was able to figure out a 64-bit
RC5 key by a brute-force search. The search took almost 5 years, and more
than 300,000 users participated (see [RSALabs]; [DistributedNet]).

All of these attacks are against collision resistance, and they are rele- 
vant for digital signatures. They are not attacks against second-pre-image 
resistance or the one-way property. Therefore, applications like HMAC (see
Section 3.4.3), whose security is based on these properties, are not yet
affected.

In the future, hash functions with longer hash values, such as SHA-256 
or SHA-512, will be used in place of MD5 and SHA-1. 


3.4.3 Data Integrity and Message Authentication 


Modification Detection Codes. Cryptographic hash functions are also
known as message digest functions, and the hash value h(m) of a message
m is called the digest or fingerprint or thumbprint of m.¹⁵ The hash value
h(m) is indeed a “fingerprint” of m. It is a very compact representation of
m, and, as an immediate consequence of second-pre-image resistance, this
representation is practically unique. Since it is computationally infeasible
to obtain a second message m' with $m \ne m'$ and h(m') = h(m), a different
hash value would result, if the message m were altered in any way.

This implies that a cryptographic hash function can be used to control 
the integrity of a message m. If the hash value of m is stored in a secure 
place, a modification of m can be detected by calculating the hash value and 
comparing it with the stored value. Therefore, hash functions are also called 
modification detection codes (MDCs). 
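
Such an integrity check can be sketched as follows. SHA-256 and the
function names are assumptions chosen for illustration; the stored
fingerprint is assumed to be kept in a place the attacker cannot modify.

```python
import hashlib
import hmac


def fingerprint(data: bytes) -> str:
    """Hash value h(m) of the data, as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


def is_unmodified(data: bytes, stored_fingerprint: str) -> bool:
    """Recompute the hash value and compare it with the stored one."""
    return hmac.compare_digest(fingerprint(data), stored_fingerprint)


original = b"contents of a certificate file"
stored = fingerprint(original)              # kept in a secure place
tampered = original.replace(b"certificate", b"certificatf")
```

Checking `original` against `stored` succeeds, whereas the altered copy
`tampered` is detected, although it differs in a single character.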

Let us consider an example. If you install a new root certificate in your 
Internet browser, you have to make sure (among other things) that the source 
of the certificate is the one you think it is and that your copy of the certificate 
was not modified. You can do this by checking the certificate’s thumbprint. 
For this purpose, you can get the fingerprint of the root certificate from the 
issuing certification authority’s web page or even on real paper by ordinary 
mail (certificates and certification authorities are discussed in Section 4.1.5). 


Message Authentication Codes. A very important application of hash 
functions is message authentication, which means to authenticate the origin 
of the message. At the same time, the integrity of the message is guaranteed. 
If hash functions are used for message authentication, they are called message 
authentication codes, or MACs for short. 

MACs are the standard symmetric technique for message authentication
and integrity protection and are widely used, for example, in protocols such as
SSL/TLS ([RFC 4346]) and IPSec. They depend on secret keys shared between
the communicating parties. In contrast to digital signatures, where
only one person knows the secret key and is able to generate the signature,
each of the two parties can produce the valid MAC for a message.

15 MD5 is a “message digest function”.

Formally, the secret keys k are used to parameterize hash functions. Thus, 
MACs are families of hash functions 


$(h_k : \{0,1\}^* \to \{0,1\}^n)_{k \in K}.$


MACs may be derived from block ciphers or from cryptographic hash func- 
tions. We describe two methods to obtain MACs. 

The standard method to convert a cryptographic hash function into 
a MAC is called HMAC. It is published in [RFC 2104] (and [FIPS 198],
[ISO/IEC 9797-2]) and can be applied to a hash function h that is derived
from a compression function f by using Merkle-Damgård’s method. You can
take as h, for example, MD5, SHA-1 or RIPEMD-160 (see Section 3.4.2). 

We have to assume that the compression rate of f and the length of the 
hash values are multiples of 8, so we can measure them in bytes. We denote 
by r the compression rate of f in bytes. The secret key k can be of any length, 
up to r bytes. By appending zero bytes, the key k is extended to a length of
r bytes (e.g., if k is of length 20 bytes and r = 64, then 44 zero bytes 0x00
are appended to k).

Two fixed and distinct strings ipad and opad are defined (the ‘i’ and ‘o’ 
are mnemonics for inner and outer): 


ipad := the byte 0x36 repeated r times,
opad := the byte 0x5C repeated r times.

The keyed hash value HMAC of a message m is calculated as follows:

$\mathrm{HMAC}(k, m) := h((k \oplus \mathit{opad}) \,\|\, h((k \oplus \mathit{ipad}) \,\|\, m)).$
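
As an illustration, the formula can be written out directly in Python. This is a minimal sketch, assuming SHA-1 as h (so that r = 64 bytes); the name hmac_sketch is hypothetical, and the standard library module hmac is imported only to cross-check the result.

```python
import hashlib
import hmac  # standard library implementation, used only for comparison


def hmac_sketch(key: bytes, message: bytes, hash_name: str = "sha1") -> bytes:
    """HMAC(k, m) = h((k XOR opad) || h((k XOR ipad) || m)) for keys of at most r bytes."""
    def h(data: bytes) -> bytes:
        return hashlib.new(hash_name, data).digest()

    r = hashlib.new(hash_name).block_size      # compression rate in bytes
    if len(key) > r:
        raise ValueError("this sketch only handles keys of at most r bytes")
    key = key + b"\x00" * (r - len(key))       # extend k with zero bytes
    k_ipad = bytes(b ^ 0x36 for b in key)
    k_opad = bytes(b ^ 0x5C for b in key)
    return h(k_opad + h(k_ipad + message))


key = b"a 20-byte secret key"
msg = b"message to authenticate"
tag = hmac_sketch(key, msg)
print(tag == hmac.new(key, msg, hashlib.sha1).digest())
```

In real code the library function should be preferred, together with a constant-time comparison such as hmac.compare_digest when checking a received MAC.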


The hash function h is applied twice in order to guarantee the security of the
MAC. If we apply h only once and define $\mathrm{HMAC}(k, m) := h((k \oplus \mathit{ipad}) \| m)$,
an adversary Eve could take a valid MAC value, modify the message m and
compute the valid MAC value of the modified message, without knowing the
secret key. For example, Eve may take any message m’ and compute the hash
value v of m’ by applying Merkle-Damgård’s iteration with
$\mathrm{HMAC}(k, m) = h((k \oplus \mathit{ipad}) \| m)$ as initial value $v_0$. Before iterating, Eve appends the padding
bits and the additional length block to m’. She does not store the length of
m’ in the length block, but the length of $\tilde{m} \| m'$, where $\tilde{m}$ is the padded
message m (including the length block for m, see Section 3.4.2; the r bytes of
the key block $k \oplus \mathit{ipad}$ count towards the hashed length). Then v is
the MAC of the extended message $\tilde{m} \| m'$, and Eve has computed it without
knowing the secret key k. This problem is called the length extension problem
of iterated hash functions. Applying the hash function twice prevents the
length extension attack.
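
The length extension problem can be reproduced with a toy iterated hash function. The following sketch is built on assumptions: the compression function (truncated SHA-256), the 16-byte block length and the padding rule are all hypothetical stand-ins chosen to keep the example short, not the definitions of MD5 or SHA-1.

```python
import hashlib

BLOCK = 16                 # block length of the toy compression function, in bytes
IV = bytes(16)             # fixed initial value v_0


def compress(v: bytes, block: bytes) -> bytes:
    """Toy compression function f (assumption: truncated SHA-256 of v || block)."""
    return hashlib.sha256(v + block).digest()[:16]


def md_padding(total_len: int) -> bytes:
    """Padding bits plus length block for an input of total_len bytes."""
    zeros = (-(total_len + 1 + 8)) % BLOCK
    return b"\x80" + bytes(zeros) + (8 * total_len).to_bytes(8, "big")


def md_hash(data: bytes, v: bytes = IV, prefix_len: int = 0) -> bytes:
    """Merkle-Damgard iteration from state v; prefix_len bytes were hashed before."""
    data += md_padding(prefix_len + len(data))
    for i in range(0, len(data), BLOCK):
        v = compress(v, data[i:i + BLOCK])
    return v


def naive_mac(key: bytes, m: bytes) -> bytes:
    """The insecure one-pass variant h((k XOR ipad) || m)."""
    k_ipad = bytes(b ^ 0x36 for b in key.ljust(BLOCK, b"\x00"))
    return md_hash(k_ipad + m)


key = b"secret key"                      # known to Alice and Bob only
m = b"pay 10 EUR"
tag = naive_mac(key, m)                  # Eve observes (m, tag)

# Eve extends the message without knowing the key.
m_extra = b" and 9990 EUR to Eve"
m_tilde = m + md_padding(BLOCK + len(m))             # padded message
forged_msg = m_tilde + m_extra
forged_tag = md_hash(m_extra, v=tag, prefix_len=BLOCK + len(m_tilde))

print(forged_tag == naive_mac(key, forged_msg))
```

Eve only needs the public block length and the length of m. With the nested construction of HMAC, the observed MAC value is not an intermediate state of a hash computation over the message, so it cannot be used as an initial value in this way.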

MACs can also be constructed from block ciphers. The most important
construction is CBC-MAC. Let E be the encryption function of a block cipher,
such as DES or AES. Then, with k a secret key, the MAC value for
a message m is the last ciphertext block when encrypting m, with E in the
Cipher-Block Chaining Mode CBC and key k (see Section 2.2.3). We need
an initialization vector IV for CBC. For encryption purposes, it is important
not to use the same value twice. Here, the IV is fixed and typically
set to 0...0. If the block length of E is n, then m is split into blocks of
length n, $m = m_1 \| m_2 \| \ldots \| m_l$ (pad out the last block, if necessary, for example,
by appending zeros), and we compute

$c_0 := IV,$
$c_i := E(k, m_i \oplus c_{i-1}), \quad i = 1, \ldots, l,$
$\mathrm{CBC\text{-}MAC} := c_l.$


Sometimes, the output of the CBC-MAC function is taken only to be a part 
of the last block. There are various standards for CBC-MAC, for example, 
[FIPS 113] and [ISO/IEC 9797-1]. A comprehensive discussion of hash func-
tions and MACs can be found in [MenOorVan96].
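
The CBC-MAC iteration can be sketched as follows. Since the Python standard library contains no block cipher, the function E below is a hypothetical toy Feistel permutation that merely stands in for DES or AES; it is not secure and serves only to make the chaining visible.

```python
import hashlib

N = 16  # block length n in bytes


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def E(k: bytes, block: bytes) -> bytes:
    """Toy 4-round Feistel permutation (assumption, stands in for DES or AES)."""
    left, right = block[:N // 2], block[N // 2:]
    for rnd in range(4):
        f = hashlib.sha256(k + bytes([rnd]) + right).digest()[:N // 2]
        left, right = right, xor(left, f)
    return left + right


def cbc_mac(k: bytes, m: bytes) -> bytes:
    if len(m) == 0 or len(m) % N != 0:
        m += bytes(N - len(m) % N)       # pad out the last block with zeros
    c = bytes(N)                         # c_0 := IV = 0...0
    for i in range(0, len(m), N):
        c = E(k, xor(m[i:i + N], c))     # c_i := E(k, m_i XOR c_{i-1})
    return c                             # CBC-MAC := c_l


k = b"shared secret key"
tag = cbc_mac(k, b"an example message of two blocks")
print(tag.hex())
```

Note that zero padding, as used here, does not distinguish a message from the same message with zero bytes appended; standardized variants address such details.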


3.4.4 Hash Functions as Random Functions 


A random function would be the perfect cryptographic hash function h. Random
means that for all messages m, each of the n bits of the hash value h(m)
is determined by tossing a coin. Such a perfect cryptographic hash function
is also called a random oracle.¹⁶ Unfortunately, it is obvious that a perfect
random oracle cannot be implemented. To determine only the hash values
for all messages of fixed length l would require exponentially many ($n \cdot 2^l$)
coin tosses and storage of all the results, which is clearly impossible.

Nevertheless, it is a design goal to construct hash functions which approxi- 
mate random functions. It should be computationally infeasible to distinguish 
the hash function from a truly random function. Recall that there is a sim- 
ilar design goal for symmetric encryption algorithms. The ciphertext should 
appear random to the attacker. That is the reason why we hoped in Section 
3.4.2 that a good block cipher induces a good compression function for a 
good hash function. 

16 Security proofs in cryptography sometimes rely on the assumption that the hash
function involved is a random oracle. An example of such a proof is given in
Section 3.4.5.

If we assume that the designers of a hash function h have done a good job
and h comes close to a random oracle, then we can use h as a generator of
pseudorandom bits. Therefore, we often see popular cryptographic hash functions
such as SHA-1 or MD5 as sources of pseudorandomness. For example,
in the Transport Layer Security (TLS) protocol ([RFC 4346]), also known as
Secure Socket Layer (SSL), client and server agree on a shared 48-byte master
secret, and then they derive further key material (for example, the MAC
keys, encryption keys and initialization vectors) from this master secret by
using a pseudorandom function. The pseudorandom function of TLS is based
on the HMAC construction, with the hash functions SHA-1 or MD5.
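
To give an impression of how key material can be expanded with HMAC, here is a sketch that follows the pseudorandom function of TLS 1.1 as we read it in [RFC 4346] (an iterated HMAC expansion, computed once with MD5 and once with SHA-1, and then XORed). The function names p_hash and tls_prf and the all-placeholder input values are assumptions made for the example.

```python
import hashlib
import hmac


def p_hash(hash_name: str, secret: bytes, seed: bytes, length: int) -> bytes:
    """Expand secret and seed to `length` bytes by iterating HMAC."""
    out = b""
    a = seed                                               # A(0) = seed
    while len(out) < length:
        a = hmac.new(secret, a, hash_name).digest()        # A(i) = HMAC(secret, A(i-1))
        out += hmac.new(secret, a + seed, hash_name).digest()
    return out[:length]


def tls_prf(secret: bytes, label: bytes, seed: bytes, length: int) -> bytes:
    """XOR of an MD5-based and a SHA-1-based expansion, each keyed with one half of secret."""
    half = (len(secret) + 1) // 2
    s1, s2 = secret[:half], secret[-half:]
    md5_part = p_hash("md5", s1, label + seed, length)
    sha1_part = p_hash("sha1", s2, label + seed, length)
    return bytes(x ^ y for x, y in zip(md5_part, sha1_part))


master_secret = bytes(range(48))          # placeholder for the 48-byte master secret
server_random = bytes(32)                 # placeholder values
client_random = bytes([1]) * 32
key_block = tls_prf(master_secret, b"key expansion",
                    server_random + client_random, 72)
print(key_block.hex())
```

The key block would then be cut into MAC keys, encryption keys and initialization vectors. The sketch is not a substitute for a TLS library.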


3.4.5 Signatures with Hash Functions 


Let (n,e) be the public RSA key and d be the secret decryption exponent
of Alice. In the basic RSA signature scheme (see Section 3.3.2), Alice can
sign messages that are encoded by numbers $m \in \{0, \ldots, n-1\}$. To sign m,
she applies the RSA decryption algorithm and obtains the signature
$\sigma = m^d \bmod n$ of m.

Typically, n is a 1024-bit number. Alice can sign a bit string m that, when
interpreted as a number, is less than n. This is a text string of at most 128
ASCII characters. Most documents are much larger, and we are not able to
sign them with basic RSA. This problem, which exists in all digital signature
schemes, is commonly solved by applying a collision resistant hash function
h.

Message m is first hashed, and the hash value h(m) is signed in place of
m. Alice’s RSA signature of m is

$\sigma = h(m)^d \bmod n.$

To verify Alice’s signature σ for message m, Bob checks whether

$\sigma^e = h(m) \bmod n.$

This way of generating signatures is called the hash-then-decrypt paradigm. 
This term is even used for signature schemes, where the signing algorithm is 
not the decryption algorithm as in RSA (see, for example, ElGamal’s Signa- 
ture Scheme in Section 3.5.2). 
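
A toy version of hash-then-decrypt with RSA may look as follows. The key is an assumption made for the example: it is built from two Mersenne primes, which is far too small and too structured for real use, and SHA-1 is taken as h. Real schemes also apply a padding format to the hash value before signing.

```python
import hashlib

# Toy RSA key (assumption): both factors are Mersenne primes.
p, q = 2**127 - 1, 2**89 - 1
n = p * q
e = 65537
d = pow(e, -1, (p - 1) * (q - 1))       # secret exponent (Python 3.8+)


def h(message: bytes) -> int:
    """SHA-1 hash value as a number; it has 160 bits and is therefore below n."""
    return int.from_bytes(hashlib.sha1(message).digest(), "big")


def sign(message: bytes) -> int:
    return pow(h(message), d, n)             # sigma = h(m)^d mod n


def verify(message: bytes, sigma: int) -> bool:
    return pow(sigma, e, n) == h(message)    # sigma^e = h(m) mod n


document = b"A document that is much longer than the modulus. " * 100
sigma = sign(document)
print(verify(document, sigma))
print(verify(document + b"!", sigma))
```

The document is several thousand bytes long, yet only its 160-bit hash value passes through the RSA function.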

Messages with the same hash value have the same signature. Collision
resistance of h is essential for non-repudiation. It prevents Alice from first
signing m and pretending later that she has signed a different message m’ and
not m. To do this, Alice would have to generate a collision (m, m’). Collision
resistance also prevents an attacker Eve from taking a signed message (m, σ)
of Alice, generating another message m’ with the same hash value and using
σ as a (valid) signature of Alice for m’. To protect against the latter attack,
second-pre-image resistance of h would be sufficient.

The hash-then-decrypt paradigm has two major advantages. Messages of 
any length can be signed by applying the basic signature algorithm, and 
the attacks, which we discussed in Section 3.3.2, are prevented. Recall that 
the hash function reduces a message of arbitrary length to a short digital 
fingerprint of less than 100 bytes. 

The schemes which we discuss now implement the hash-then-decrypt 
paradigm.

Full-Domain-Hash RSA signatures. We apply the hash-then-decrypt
paradigm in an RSA signature scheme with public key (n,e) and secret key
d and use a hash function

$h : \{0,1\}^* \to \{0, \ldots, n-1\},$

whose values range through the full set $\{0, \ldots, n-1\}$ rather than a smaller
subset. Such a hash function h is called a full-domain hash function, because
the image of h is the full domain $\{0, \ldots, n-1\}$ of the RSA function.¹⁷ The
signature of a message $m \in \{0,1\}^*$ is $h(m)^d \bmod n$.

The hash functions that are typically used in practical RSA schemes, like 
SHA, MD5 or RIPEMD, are not full-domain hash functions. They produce 
hash values of bit length between 128 and 512 bits, whereas the typical bit 
length of n is 1024 or 2048. 

It can be mathematically proven that full-domain-hash RSA signatures 
are secure in the random oracle model ([BelRog93]), and we will give such a 
proof in this section. 

For this purpose, we consider an adversary F, who attacks Bob, the legitimate
owner of an RSA key pair, and tries to forge at least one signature
of Bob, without knowing Bob’s private key d. More precisely, F is an efficiently
computable algorithm that, with some probability of success, on input
of Bob’s public RSA key (n,e) outputs a message m together with a valid
signature σ of m.


The random oracle model. In this model, the hash function h is assumed 
to operate as a random oracle. This means that 


1. the hash function h is a random function (as explained in Section 3.4.4), 
and 

2. whenever the adversary F needs the hash value for a message m, it has
to call the oracle h with m as input. Then it obtains the hash value h(m) 
from the oracle. 


Condition 2 means that F always calls h as a “black box” (for example, 
by calling it as a subroutine or by communicating with another computer 
program), whenever it needs a hash value, and this may appear as a trivial 
condition. But it includes, for example, that the adversary has no algorithm 
to compute the hash values by itself; it has no knowledge about the internal 
structure of h, and it is stateless with respect to hash values. It does not store 
and reuse any hash values from previous executions. The hash values h(m)
appear as truly random values to it.

We assume from now on that our full-domain hash function h is a random
oracle. Given m, each element of $\mathbb{Z}_n$ has the same probability 1/n of being
the hash value h(m).

The security of RSA signatures relies, of course, on the RSA assumption,
which states that the RSA function is a one-way function. Without knowing
the secret exponent d, it is infeasible to compute e-th roots modulo n, i.e.,
for a randomly chosen e-th power $y = x^e \bmod n$, it is impossible to compute
x from y with more than a negligible probability (see Definition 6.7 for a
precise statement).

17 As often, we identify $\mathbb{Z}_n$ with $\{0, \ldots, n-1\}$.

Our security proof for full-domain-hash RSA signatures is a typical one.
We develop an efficient algorithm A which attacks the underlying assumption,
here the RSA assumption. In our example, A tries to compute the e-th
root of a randomly chosen $y \in \mathbb{Z}_n$. The algorithm A calls the forger F as a
subroutine. If F is successful in its forgery, then A is successful in computing
the e-th root. Now, we conclude: since it is infeasible to compute e-th roots
(by the RSA assumption), F cannot be successful, i.e., it is impossible to
forge signatures. By A, the security of the signature scheme is reduced to
the security of the RSA trapdoor function. Therefore, such proofs are called
security proofs by reduction.

The security of full-domain-hash signatures is guaranteed, even if forger
F is supplied with valid signatures for messages m’ of its choice. Of course,
to be successful, F has to produce a valid signature for a message m which
is different from the messages m’. F can request the signature for a message
m’ at any time during its attack, and it can choose the messages m’ adaptively,
i.e., F can analyze the signatures that it has previously obtained, and
then choose the next message to be signed. F performs an adaptively-chosen-message
attack (see Section 10.1 for a more detailed discussion of the various
types of attacks against signature schemes).

In the real attack, the forger F interacts with Bob, the legitimate owner 
of the secret key, to obtain signatures, and with the random oracle to obtain 
hash values. Algorithm A is constructed to replace both Bob and the random
oracle h, in the attack. It “simulates” the signer Bob and h. 

Since A has no access to the secret key, it has a problem producing a
valid signature when F issues a signature request for message m’. Here, the
random oracle model helps A. It is not the message that is signed, but its
hash value. To check if a signature is valid, the forger F must know the hash
value, and to get the hash value it has to ask the random oracle. Algorithm
A answers in place of the oracle. If asked for the hash value of a message m’,
it selects $s \in \mathbb{Z}_n$ at random and supplies $s^e$ as the hash value. Then, it can
provide s as the valid signature of m’.

Forger F cannot detect that A sends manipulated hash values. The elements
s, $s^e$ and the real hash values (generated by a random oracle) are all
random and uniformly distributed elements of $\mathbb{Z}_n$. This means that forger
F, when interacting with A, runs in the same probabilistic setting as in the
real attack. Therefore, its probability of successfully forging a signature is
the same as in the real attack against Bob.

A takes as input the public key (n,e) and a random element $y \in \mathbb{Z}_n$. Let
F query the hash values of the r messages $m_1, m_2, \ldots, m_r$. The structure of
A is the following.

Algorithm 3.10.

int A(int n, e, y)
1  choose t ∈ {1,...,r} at random and set h_t ← y
2  choose s_i ∈ Z_n at random and set h_i ← s_i^e, i = 1,...,r, i ≠ t
3  call F(n,e)
4    if F queries the hash value of m_i, then respond with h_i
5    if F requests the signature of m_i, i ≠ t, then respond with s_i
6    if F requests the signature of m_t, then terminate with failure
7    if F requests the signature of m’, m’ ≠ m_i for i = 1,...,r,
8      then respond with a random element of Z_n
9    if F returns (m,s), return s

In step 1, A tries to guess the message $m \in \{m_1, \ldots, m_r\}$, for which F
will output a forged signature. F must know the hash value of m. Otherwise,
the hash value h(m) of m would be randomly generated independently from
F’s point of view. Then, the probability that a signature s, generated by
F, satisfies the verification condition $s^e \bmod n = h(m)$ is 1/n, and hence
negligibly small. Thus, m is necessarily one of the messages $m_i$, for which F
queries the hash value.

If F requests the signature of $m_i$, i ≠ t, then A responds with the valid
signature $s_i$ (line 5). If F requests the signature of m’, m’ ≠ $m_i$ for i =
1,...,r, and F never asks for the hash value of m’, then A can respond with
a random value (line 7), since F is not able to check the validity of the answer.

Suppose that A guesses the right $m_t$ in step 1 and F forges successfully.
Then, F returns a valid signature s for $m_t$, which is an e-th root of y, i.e.,
$s^e = h(m_t) = y$. In this case, A returns s. It has successfully computed an
e-th root of y modulo n.

The probability that A guesses the right $m_t$ in step 1 is 1/r. Hence, the
success probability of A is $1/r \cdot \alpha$, where α is the success probability of forger
F. Assume for a moment that forger F is always successful, i.e., α = 1. By
independent repetitions of A, we then get an algorithm which successfully
computes e-th roots with a probability close to 1. In general, we get an
algorithm to compute e-th roots with about the same success probability α
as F.
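
The interplay between A and F can be simulated. The sketch below rests on several assumptions: the toy RSA modulus, the class Failure and the function toy_forger are hypothetical. In particular, since no real forger is available, toy_forger secretly uses the exponent D to stand in for an always successful forging strategy; A itself never uses D. Messages are numbered from 0 instead of 1, and A checks the returned value before accepting it.

```python
import random

# Toy RSA instance from two Mersenne primes (assumption, far too small for real use).
P, Q = 2**127 - 1, 2**89 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))    # used only inside the stand-in forger
R = 5                                 # number r of hash queries made by F


class Failure(Exception):
    """Raised when F requests the signature of m_t (line 6)."""


def toy_forger(n, e, hash_oracle, sign_oracle):
    """Hypothetical forger F; it sees hash values only through hash_oracle."""
    messages = [b"message %d" % i for i in range(R)]
    hashes = [hash_oracle(m) for m in messages]
    sigma = sign_oracle(messages[0])             # chosen-message query
    assert pow(sigma, e, n) == hashes[0]         # the simulated answer verifies
    j = random.randrange(1, R)                   # message to be forged
    return messages[j], pow(hashes[j], D, n)


def A(n, e, y, forger):
    """Sketch of Algorithm 3.10: returns an e-th root of y, or None on failure."""
    t = random.randrange(R)                                    # line 1
    s = [random.randrange(n) for _ in range(R)]                # line 2
    h = [y if i == t else pow(s[i], e, n) for i in range(R)]
    index = {}                                                 # message -> i

    def hash_oracle(m):                                        # line 4
        if m not in index:
            index[m] = len(index)
        return h[index[m]]

    def sign_oracle(m):
        if m not in index:                                     # lines 7-8
            return random.randrange(n)
        if index[m] == t:                                      # line 6
            raise Failure
        return s[index[m]]                                     # line 5

    try:
        m, sig = forger(n, e, hash_oracle, sign_oracle)        # line 3
    except Failure:
        return None
    return sig if pow(sig, e, n) == y else None                # line 9, with a check


x = random.randrange(N)
y = pow(x, E, N)             # A is given y, but not x
root = None
while root is None:          # independent repetitions of A
    root = A(N, E, y, toy_forger)
print(root == x)
```

With this particular forger, a single run of A succeeds with probability 1/r, in line with the estimate above.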

We described the notion of provable security in the random oracle model 
by studying full-domain hash RSA signatures. The proof says that a successful
forger cannot exist in the random oracle model. But since real hash
functions are never perfect random oracles (see Section 3.4.4 above), our 
argument can never be completed to a security proof of the real signature 
scheme, where a real implementation of the hash function has to be used. 

In Section 9.5, we will give a random-oracle proof for Boneh’s SAEP 
encryption scheme. 

Our proof requires a full-domain hash function h: it is essential that each
element of $\mathbb{Z}_n$ has the same probability of being the hash value. The hash
functions used in practice usually are not full-domain hash functions, as we
observed above. The scheme we describe in the next section does not rely on
a full-domain hash function. It provides a clever embedding of the hashed
message into the domain of the signature function.


PSS. The probabilistic signature scheme (PSS) was introduced in [BelRog96].
The signature of a message depends on the message and some randomly chosen
input. The resulting signature scheme is therefore probabilistic. To set
up the scheme, we need the decryption function of a public-key cryptosystem
like the RSA decryption function or the decryption function of Rabin’s
cryptosystem (see Section 3.6). More generally, it requires a trapdoor permutation

$f : D \to D, \quad D \subset \{0,1\}^n,$

a pseudorandom bit generator

$G : \{0,1\}^l \to \{0,1\}^k \times \{0,1\}^{n-(l+k)}, \quad w \mapsto (G_1(w), G_2(w)),$

and a hash function

$h : \{0,1\}^* \to \{0,1\}^l.$

The PSS is applicable to messages of arbitrary length. The message m
cannot be recovered from the signature σ.


Signing. To sign a message $m \in \{0,1\}^*$, Alice proceeds in three steps:

1. Alice chooses $r \in \{0,1\}^k$ at random and calculates $w := h(m \| r)$.
2. She computes $G(w) = (G_1(w), G_2(w))$ and $y := w \| (G_1(w) \oplus r) \| G_2(w)$.
   (If $y \notin D$, she returns to step 1.)
3. The signature of m is $\sigma := f^{-1}(y)$.


As usual, $\|$ denotes the concatenation of strings and $\oplus$ the bitwise XOR
operator. If Alice wants to sign message m, she concatenates a random seed
r to the message and applies the hash function h to $m \| r$. Then Alice applies
the generator G to the hash value w. The first part $G_1(w)$ of G(w) is used
to mask r; the second part of G(w), $G_2(w)$, is appended to $w \| (G_1(w) \oplus r)$ to
obtain a bit string y of appropriate length. All bits of y depend on the message
m. The hope is that mapping m into the domain of f by $m \mapsto y$ behaves
like a truly random function. This assumption guarantees the security of the
scheme. Finally, y is decrypted with f to get the signature. The random seed
r is selected independently for each message m, so signing a message twice
yields distinct signatures.


Verification. To verify the signature of a signed message (m, σ), we use the
same trapdoor function f, the same random bit generator G and the same
hash function h as above, and proceed as follows:

1. Compute f(σ) and decompose $f(\sigma) = w \| t \| u$,
   where |w| = l, |t| = k and |u| = n − (k + l).
2. Compute $r = t \oplus G_1(w)$.
3. We accept the signature σ if $h(m \| r) = w$ and $G_2(w) = u$; otherwise we
   reject it.
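
Signing and verification can be sketched as follows. All concrete choices are assumptions made for the example: a toy RSA function as trapdoor permutation f (n = 216 bits), k = l = 64, truncated SHA-256 as h and SHAKE-128 as the generator G. The standardized RSA-PSS encoding differs in its details.

```python
import hashlib
import secrets

# Toy trapdoor permutation (assumption): RSA with two Mersenne prime factors.
P, Q = 2**127 - 1, 2**89 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))

N_BITS = N.bit_length()        # n = 216
K = L = 64                     # k and l in bits
REST = N_BITS - (K + L)        # n - (k + l) = 88


def f(x: int) -> int:
    return pow(x, E, N)


def f_inv(y: int) -> int:
    return pow(y, D, N)


def h(data: bytes) -> int:
    """Hash function with l output bits (truncated SHA-256)."""
    return int.from_bytes(hashlib.sha256(data).digest()[:L // 8], "big")


def G(w: int):
    """Generator G(w) = (G1(w), G2(w)) with k and n-(l+k) output bits."""
    stream = hashlib.shake_128(w.to_bytes(L // 8, "big")).digest((K + REST) // 8)
    return (int.from_bytes(stream[:K // 8], "big"),
            int.from_bytes(stream[K // 8:], "big"))


def sign(m: bytes) -> int:
    while True:
        r = secrets.randbits(K)                            # step 1
        w = h(m + r.to_bytes(K // 8, "big"))
        g1, g2 = G(w)                                      # step 2
        y = (w << (K + REST)) | ((g1 ^ r) << REST) | g2    # y = w || (G1(w) XOR r) || G2(w)
        if y < N:                                          # y must lie in the domain D
            return f_inv(y)                                # step 3


def verify(m: bytes, sigma: int) -> bool:
    y = f(sigma)
    w, t, u = y >> (K + REST), (y >> REST) & (2**K - 1), y & (2**REST - 1)
    g1, g2 = G(w)
    r = t ^ g1
    return h(m + r.to_bytes(K // 8, "big")) == w and g2 == u


message = b"probabilistic signatures"
sig1, sig2 = sign(message), sign(message)
print(verify(message, sig1), verify(message, sig2), sig1 != sig2)
```

The last line illustrates that two signatures of the same message are both valid and, because of the fresh random seed, almost certainly different.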


PSS can be proven to be secure in the random oracle model under the 
RSA assumption. The proof assumes that the hash functions G and h are 
random oracles. 

For practical applications of the scheme, it is recommended to implement
the hash function h and the random bit generator G with the secure hash
algorithm SHA-1 or some other cryptographic hash algorithm that is considered
collision resistant. Typical values of the parameters n, k and l are
n = 1024 bits and k = l = 128 bits.


3.5 The Discrete Logarithm 


In Section 3.3 we discussed the RSA cryptosystem. The RSA function raises 
an element m to the e-th power. It is a bijective function and is efficient to 
compute. If the factorization of n is not known, there is no efficient algorithm 
for computing the e-th root. There are other functions in number theory 
that are easy to compute but hard to invert. One of the most important is 
exponentiation in finite fields. Let p be a prime and g be a primitive root in
$\mathbb{Z}_p^*$ (see Appendix A.4). The discrete exponential function

$\mathrm{Exp} : \mathbb{Z}_{p-1} \to \mathbb{Z}_p^*, \quad x \mapsto g^x,$

is a one-way function. It can be efficiently computed, for example, by the
repeated squaring algorithm (Section 3.2). No efficient algorithm for computing
the inverse function Log of Exp, i.e., for computing x from $y = g^x$, is
known, and it is widely believed that no such algorithm exists. This assumption
is called the discrete logarithm assumption (for a precise definition, see
Definition 6.1).


3.5.1 ElGamal’s Encryption 


In contrast to the RSA function, Exp is a one-way function without a trapdoor.
There is no additional information that makes the computation
of the inverse function easy. Nevertheless, Exp is the basis of ElGamal’s
cryptosystem ([ElGamal84]).


Key Generation. The recipient of messages, Bob, proceeds as follows: 


1. He chooses a large prime p, such that p − 1 has a big prime factor, and a
   primitive root $g \in \mathbb{Z}_p^*$.

2. He chooses at random an integer x in the range 0 ≤ x ≤ p − 2.
   The triple (p, g, x) is the secret key of Bob.

3. He computes $y = g^x$ in $\mathbb{Z}_p$. The public key of Bob is (p, g, y), and x is
   kept secret.

The number p − 1 will have a large prime factor if Bob is looking for primes
p of the form 2kq + 1, where q is a large prime. Thus, Bob first chooses a
large prime q. Here he proceeds in the same way as in the RSA key generation
procedure (see Section 3.3.1). Then, to get p, Bob randomly generates a k of
appropriate bit length and applies a probabilistic primality test to z = 2kq + 1.
He replaces k by k + 1 until he succeeds in finding a prime. He expects to test
O(ln z) numbers for primality before reaching the first prime (see Corollary
A.71). Having found a prime p = 2kq + 1, he randomly selects elements g in
$\mathbb{Z}_p^*$ and tests whether g is a primitive root. The factorization of k is required
for this test (see Algorithm A.39). Thus q must be chosen to be sufficiently
large, such that k is small enough to be factored efficiently.

Bob has to make sure that not all prime factors of p − 1 are small. Otherwise, there
is an efficient algorithm for the computation of discrete logarithms developed 
by Silver, Pohlig and Hellman (see [Koblitz94]). 


Encryption and Decryption. Alice encrypts messages for Bob by using
Bob’s public key (p, g, y). She can encrypt elements $m \in \mathbb{Z}_p$. To encrypt a
message $m \in \mathbb{Z}_p$, Alice chooses at random an integer k, 1 ≤ k ≤ p − 2. The
encrypted message is the following pair $(c_1, c_2)$ of elements in $\mathbb{Z}_p$:

$(c_1, c_2) = (g^k, y^k m).$

The computations are done in $\mathbb{Z}_p$. By multiplying m with $y^k$, Alice hides the
message m behind the random element $y^k$.

Bob decrypts a ciphertext $(c_1, c_2)$ by using his secret key x. Since
$y^k = (g^x)^k = (g^k)^x = c_1^x$, he obtains the plaintext m by multiplying $c_2$ with the
inverse $c_1^{-x}$ of $c_1^x$:

$c_1^{-x} c_2 = y^{-k} y^k m = m.$

Recall that $c_1^{-x} = c_1^{p-1-x}$, because $c_1^{p-1} = [1]$ (see “Computing modulo a
prime” on page 303). Therefore, Bob can decrypt the ciphertext by raising
$c_1$ to the (p − 1 − x)-th power, $m = c_1^{p-1-x} c_2$. Note that p − 1 − x is a positive
number.

The encryption algorithm is not a deterministic algorithm. The cryp- 
togram depends on the message, the public key and on a randomly chosen 
number. If the random number is chosen independently for each message, it 
rarely happens that two plaintexts lead to the same ciphertext. 
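
The following sketch runs key generation, encryption and decryption with toy parameters. The values p = 2579 = 2 · 1289 + 1 and g = 2 are assumptions chosen for readability (2 is a primitive root modulo 2579); real parameters are far larger. The function names are hypothetical.

```python
import secrets

p, g = 2579, 2        # toy parameters (assumption): p - 1 = 2 * 1289, g primitive root


def is_primitive_root(g: int, p: int, prime_factors) -> bool:
    """g is a primitive root iff g^((p-1)/f) != 1 for every prime factor f of p - 1."""
    return all(pow(g, (p - 1) // f, p) != 1 for f in prime_factors)


def keygen():
    x = secrets.randbelow(p - 1)            # secret exponent, 0 <= x <= p - 2
    return x, pow(g, x, p)                  # (x, y = g^x)


def encrypt(y: int, m: int):
    k = 1 + secrets.randbelow(p - 2)        # fresh random k, 1 <= k <= p - 2
    return pow(g, k, p), (pow(y, k, p) * m) % p


def decrypt(x: int, c1: int, c2: int) -> int:
    return (pow(c1, p - 1 - x, p) * c2) % p     # m = c1^(p-1-x) * c2


x, y = keygen()
c1, c2 = encrypt(y, 1299)
print(decrypt(x, c1, c2))
```

Encrypting the same plaintext again will, with high probability, give a different pair (c1, c2), because a new k is chosen each time.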

The security of the scheme depends on the following assumption: it is impossible
to compute $g^{xk}$ (and hence $g^{-xk} = (g^{xk})^{-1}$ and m) from $g^x$ and $g^k$.
This computational task is called the Diffie-Hellman problem. An efficient algorithm to compute
discrete logarithms would solve the Diffie-Hellman problem. It is unknown 
whether the Diffie-Hellman problem is equivalent to computing discrete log- 
arithms, but it is believed that no efficient algorithm exists for this problem 
(also see Section 4.1.2). 

Like basic RSA (see Section 3.3.3), ElGamal’s encryption is vulnerable to
a chosen-ciphertext attack. Adversary Eve, who wants to decrypt a ciphertext
$c = (c_1, c_2)$, with $c_1 = g^k$ and $c_2 = m y^k$, chooses random elements $\tilde{k}$ and $\tilde{m}$
and gets Bob to decrypt $\tilde{c} = (c_1 g^{\tilde{k}}, c_2 \tilde{m} y^{\tilde{k}})$. Bob sends $\tilde{m} m$, the plaintext
of $\tilde{c} = (g^{k+\tilde{k}}, \tilde{m} m y^{k+\tilde{k}})$, to Eve. Eve simply divides by $\tilde{m}$ and obtains the
plaintext m of c: $m = (\tilde{m} m) \tilde{m}^{-1}$. Bob’s suspicion is not aroused, because the
plaintext $\tilde{m} m$ looks random to him.
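
The attack is easy to replay with toy numbers. In this sketch the parameters p = 2579, g = 2 and the secret key x = 765 are assumptions, and bob_decrypts plays the role of the decryption service that Eve is allowed to query.

```python
import secrets

p, g = 2579, 2                 # toy parameters (assumption)
x = 765                        # Bob's secret key, unknown to Eve
y = pow(g, x, p)               # Bob's public key


def encrypt(m: int):
    k = 1 + secrets.randbelow(p - 2)
    return pow(g, k, p), (m * pow(y, k, p)) % p


def bob_decrypts(c1: int, c2: int) -> int:
    return (pow(c1, p - 1 - x, p) * c2) % p


m = 1299
c1, c2 = encrypt(m)            # ciphertext that Eve wants to decrypt

# Eve blinds the ciphertext with random elements k~ and m~.
k_t = 1 + secrets.randbelow(p - 2)
m_t = 1 + secrets.randbelow(p - 1)                 # unit modulo p
c1_t = (c1 * pow(g, k_t, p)) % p
c2_t = (c2 * m_t * pow(y, k_t, p)) % p

blinded = bob_decrypts(c1_t, c2_t)                 # Bob returns m~ * m
recovered = (blinded * pow(m_t, -1, p)) % p        # Eve divides by m~
print(recovered == m)
```

Bob never sees the original ciphertext, since the first component has been multiplied by a power of g different from 1.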


3.5.2 ElGamal’s Signature Scheme 


Key Generation. To generate a key for signing, Alice proceeds as Bob in 
the key generation procedure above to obtain a public key (p,g,y) and a 
secret key (p, g, x) with $y = g^x$.


Signing. We assume that the message m to be signed is an element in $\mathbb{Z}_{p-1}$.
In practice, a hash function h is used to map the messages into $\mathbb{Z}_{p-1}$. Then
the hash value is signed. The signed message is produced by Alice using the
following steps:

1. She selects a random integer k, 1 ≤ k ≤ p − 2, with gcd(k, p − 1) = 1.
2. She sets $r := g^k$ and $s := k^{-1}(m - rx) \bmod (p-1)$.
3. (m, r, s) is the signed message.

Verification. Bob verifies the signed message (m, r, s) as follows:

1. He verifies whether 1 ≤ r ≤ p − 1. If not, he rejects the signature.
2. He computes $v := g^m$ and $w := y^r r^s$, where y is Alice’s public key.
3. The signature is accepted if v = w; otherwise it is rejected.


Proposition 3.11. If Alice signed the message (m,r,s), we have v = w. 


Proof.

$w = y^r r^s = (g^x)^r (g^k)^s = g^{rx} g^{k k^{-1}(m - rx)} = g^m = v.$

Here, recall that exponents of g can be reduced modulo (p − 1), since $g^{p-1} = [1]$
(see “Computing modulo a prime” on page 303).
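
Signing and verification can be tried out with toy parameters. In the sketch below, p = 2579, g = 2 and the secret key x = 765 are assumptions, and the message is taken directly as a number instead of a hash value.

```python
import secrets
from math import gcd

p, g = 2579, 2             # toy parameters (assumption), g primitive root mod p
x = 765                    # Alice's secret key
y = pow(g, x, p)           # Alice's public key


def sign(m: int):
    while True:
        k = 1 + secrets.randbelow(p - 2)          # 1 <= k <= p - 2
        if gcd(k, p - 1) == 1:
            break
    r = pow(g, k, p)
    s = (pow(k, -1, p - 1) * (m - r * x)) % (p - 1)
    return r, s


def verify(m: int, r: int, s: int) -> bool:
    if not 1 <= r <= p - 1:                       # step 1
        return False
    v = pow(g, m, p)                              # step 2
    w = (pow(y, r, p) * pow(r, s, p)) % p
    return v == w                                 # step 3


m = 1463
r, s = sign(m)
print(verify(m, r, s), verify(m + 1, r, s))
```

Since g is a primitive root, the values g^m and g^(m+1) differ, so the same pair (r, s) cannot be valid for both messages.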


Remarks. The following observations concern the security of the system: 


1. The security of the system depends on the discrete logarithm assumption.
   Someone who can compute discrete logarithms can get everyone’s
   secret key and thereby break the system totally. To find an s, such that
   $g^m = y^r r^s$, on given inputs m and r, is equivalent to the computation of
   discrete logarithms.
   To forge a signature for a message m, one has to find elements r and s,
   such that $g^m = y^r r^s$. It is not known whether this problem is equivalent
   to the computation of discrete logarithms. However, it is also believed
   that no efficient algorithm for this problem exists.

2. If adversary Eve succeeds in getting the chosen random number k for
   some signed message m, she can compute $rx = (m - sk) \bmod (p-1)$ and
   the secret key x, because with high probability gcd(r, p − 1) = 1. Thus,
   the random number generator used to get k must be of superior quality.

3. It is absolutely necessary to choose a new random number for each message.
   If the same random number is used for different messages m ≠ m’,
   it is possible to compute k: $s - s' = (m - m')k^{-1} \bmod (p-1)$ and hence
   $k = (s - s')^{-1}(m - m') \bmod (p-1)$.

4. When used without a hash function, ElGamal’s signature scheme is existentially
   forgeable; i.e., an adversary Eve can construct a message m and
   a valid signature (m, r, s) for m.
   This is easily done. Let b and c be numbers such that gcd(c, p − 1) = 1.
   Set $r = g^b y^c$, $s = -rc^{-1} \bmod (p-1)$ and $m = -rbc^{-1} \bmod (p-1)$. Then
   (m, r, s) satisfies $g^m = y^r r^s$. Fortunately in practice, as observed above,
   a hash function h is applied to the original message, and it is the hash
   value that is signed. Thus, to forge the signature for a real message is
   not so easy. Adversary Eve has to find some meaningful message $\tilde{m}$ with
   $h(\tilde{m}) = m$. If h is a collision-resistant hash function, her probability of
   accomplishing this is very low.

5. D. Bleichenbacher observed in [Bleichenbacher96] that step 1 in the verification
   procedure is essential. Otherwise Eve would be able to sign messages
   of her choice, provided she knows one valid signature (m, r, s), where
   m is a unit in $\mathbb{Z}_{p-1}$.
   Let m’ be a message of Eve’s choice, $u = m' m^{-1} \bmod (p-1)$,
   $s' = su \bmod (p-1)$, and $r' \in \mathbb{Z}$ such that $r' = r \bmod p$ and
   $r' = ru \bmod (p-1)$. r’ is obtained by the Chinese Remainder Theorem
   (see Theorem A.29). Then (m’, r’, s’) is accepted by the verification
   procedure.
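
Remarks 3 and 4 can be checked numerically. The sketch uses the assumed toy parameters p = 2579, g = 2 and secret key x = 765; the values k = 13, m1, m2, b and c are chosen so that all inverses modulo p − 1 used below exist (for instance, r = 2^13 mod 2579 = 455 is coprime to p − 1 = 2 · 1289).

```python
p, g = 2579, 2             # toy parameters (assumption)
x = 765                    # Alice's secret key, to be recovered by Eve
y = pow(g, x, p)


def sign_with_k(m: int, k: int):
    """ElGamal signature with an externally supplied random number k."""
    r = pow(g, k, p)
    s = (pow(k, -1, p - 1) * (m - r * x)) % (p - 1)
    return r, s


def verify(m: int, r: int, s: int) -> bool:
    if not 1 <= r <= p - 1:
        return False
    return pow(g, m, p) == (pow(y, r, p) * pow(r, s, p)) % p


# Remark 3: the same k is used for two different messages.
m1, m2, k = 1000, 2001, 13
r, s1 = sign_with_k(m1, k)
_, s2 = sign_with_k(m2, k)
diff = (s1 - s2) % (p - 1)
k_rec = (pow(diff, -1, p - 1) * (m1 - m2)) % (p - 1)          # k = (s - s')^-1 (m - m')
x_rec = (pow(r, -1, p - 1) * (m1 - s1 * k_rec)) % (p - 1)     # from rx = m - sk
print(k_rec == k, x_rec == x)

# Remark 4: existential forgery without knowledge of x.
b, c = 5, 7                                                   # gcd(c, p - 1) = 1
c_inv = pow(c, -1, p - 1)
r_f = (pow(g, b, p) * pow(y, c, p)) % p
s_f = (-r_f * c_inv) % (p - 1)
m_f = (-r_f * b * c_inv) % (p - 1)
print(verify(m_f, r_f, s_f))
```

In the forgery, Eve cannot choose the message m_f freely; it is determined by b and c, which is why hashing the message before signing makes the forgery useless in practice.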


3.5.3 Digital Signature Algorithm 


In 1991 NIST proposed a digital signature standard (DSS) (see [NIST94]). 
DSS was intended to become a standard digital signature method for use 
by government and financial organizations. The DSS contains the digital 
signature algorithm (DSA), which is very similar to ElGamal’s algorithm. 


Key Generation. The keys are generated in a similar way as in ElGamal’s
signature scheme. As above, a prime p, an element $g \in \mathbb{Z}_p^*$ and an exponent
x are chosen. x is kept secret, whereas p, g and $y = g^x$ are published. The
difference is that g is not a primitive root in $\mathbb{Z}_p^*$, but an element of order q,
where q is a prime divisor of p − 1.¹⁸ Moreover, the binary size of q is required
to be 160 bits.


To generate a public and a secret key, Alice proceeds as follows:

1. She chooses a 160-bit prime q and a prime p, such that q divides p − 1
   (p should have the binary length |p| = 512 + 64t, 0 ≤ t ≤ 8). She can do
   this in a way analogous to the key generation in ElGamal’s encryption
   scheme. First, she selects q at random, and then she looks for a prime p
   in {2kq + 1, 2(k + 1)q + 1, 2(k + 2)q + 1, ...}, with k a randomly chosen
   number of appropriate size.

2. To get an element g of order q, she selects elements $h \in \mathbb{Z}_p^*$ at random
   until $g := h^{(p-1)/q} \neq [1]$. Then g has order q, and it generates the unique
   cyclic group $G_q$ of order q in $\mathbb{Z}_p^*$.¹⁹ Note that in $G_q$ elements are computed
   modulo p and exponents are computed modulo q.²⁰

3. Finally, she chooses an integer x in the range 1 ≤ x ≤ q − 1 at random.

4. (p, q, g, x) is the secret key, and the public key is (p, q, g, y), with $y := g^x$.

18 The order of g is the smallest $e \in \mathbb{N}$ with $g^e = [1]$. The order of a primitive root
in $\mathbb{Z}_p^*$ is p − 1.


Signing. Messages m to be signed by DSA must be elements in $\mathbb{Z}_q$. In DSS,
a hash function h is used to map real messages to elements of $\mathbb{Z}_q$. The signed
message is produced by Alice using the following steps:

1. She selects a random integer k, 1 ≤ k ≤ q − 1.

2. She sets $r := (g^k \bmod p) \bmod q$ and $s := k^{-1}(m + rx) \bmod q$. If s = 0,
   she returns to step 1, but it is extremely unlikely that this occurs.

3. (m, r, s) is the signed message.


Recall the verification condition of ElGamal’s signature scheme. It says
that $(m, \tilde{r}, \tilde{s})$ with $\tilde{r} = g^k \bmod p$ and $\tilde{s} = k^{-1}(m - \tilde{r}x) \bmod (p - 1)$ can be
verified by

$$y^{\tilde{r}}\,\tilde{r}^{\tilde{s}} = g^m \bmod p.$$ ²¹

Now suppose that, as in DSA, $\tilde{s}$ is defined by use of $(m + \tilde{r}x)$,
$\tilde{s} = (m + \tilde{r}x)k^{-1} \bmod (p - 1)$, and not by use of $(m - \tilde{r}x)$, as in ElGamal’s scheme.
Then the equation remains valid if we replace the exponent $\tilde{r}$ of $y$ by $-\tilde{r}$:

$$y^{-\tilde{r}}\,\tilde{r}^{\tilde{s}} = g^m \bmod p. \tag{3.1}$$

In the DSA, $g$ and hence $\tilde{r}$ and $y$ are elements of order $q$ in $\mathbb{Z}_p^*$. Thus, we can
replace the exponents $\tilde{s}$ and $\tilde{r}$ in (3.1) by $\tilde{s} \bmod q = s$ and $\tilde{r} \bmod q = r$. So,
we have the idea that a verification condition for $(r, s)$ may be derived from
(3.1) by reducing $\tilde{r}$ and $\tilde{s}$ modulo $q$. This is not so easy, because the exponent
$\tilde{r}$ also appears as a base on the left-hand side of (3.1). The base cannot be
reduced without destroying the equality. To overcome this difficulty, we first
transform (3.1) to

$$\tilde{r}^{s} = g^m y^{r} \bmod p.$$


19 There is a unique subgroup $G_q$ of order $q$ of $\mathbb{Z}_p^*$. $G_q$ consists of the unit element
and all elements $x \in \mathbb{Z}_p^*$ of order $q$. It is cyclic and each member except the unit
element is a generator, see Lemma A.40.

20 See “Computing modulo a prime” on page 303.

21 We write “mod p” to make clear that computations are done in $\mathbb{Z}_p$.

Now, the idea of DSA is to remove the exponentiation on the left-hand side.
This is possible because $s$ is a unit in $\mathbb{Z}_q$. For $t = s^{-1} \bmod q$, we get

$$\tilde{r} = (g^m y^r)^t \bmod p.$$

Now we can reduce modulo $q$ on both sides and obtain the verification
condition of DSA:

$$r = \tilde{r} \bmod q = ((g^m y^r)^t \bmod p) \bmod q.$$

A complete proof of the verification condition is given below (Proposition
3.12). Note that the exponentiations on the right-hand side are done in $\mathbb{Z}_p$.
Verification. Bob verifies the signed message $(m, r, s)$ as follows:


1. He verifies that $1 \le r \le q - 1$ and $1 \le s \le q - 1$; if not, then he rejects
the signature.

2. He computes the inverse $t := s^{-1}$ of $s$ modulo $q$ and
$v := ((g^m y^r)^t \bmod p) \bmod q$, where $y$ is the public key of Alice.

3. The signature is accepted if $v = r$; otherwise it is rejected.


Proposition 3.12. If Alice signed the message $(m, r, s)$, we have $v = r$.

Proof.

$$\begin{aligned}
v &= ((g^m y^r)^t \bmod p) \bmod q = (g^{mt} g^{rxt} \bmod p) \bmod q \\
  &= (g^{(m + rx)t \bmod q} \bmod p) \bmod q = (g^k \bmod p) \bmod q \\
  &= r.
\end{aligned}$$

Note that exponents can be reduced modulo $q$, because $g^q = 1 \bmod p$.
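
Signing and verification can be tried out with a short Python sketch. The parameters $p = 607$, $q = 101$ and $g = 2^6 \bmod p$ are toy values assumed purely for illustration, the function names are hypothetical, and the message is taken to be an element of $\mathbb{Z}_q$ already (no hash function is applied). The three-argument `pow` with exponent `-1` needs Python 3.8 or later.

```python
import random

# Toy parameters (assumed for illustration): Q divides P - 1, G has order Q.
P, Q = 607, 101
G = pow(2, (P - 1) // Q, P)   # 2^6 mod 607 = 64, not 1, hence of order Q

def sign(m: int, x: int, rng: random.Random):
    """Return (r, s) for a message m in Z_q under the secret key x."""
    while True:
        k = rng.randrange(1, Q)                      # 1 <= k <= q-1
        r = pow(G, k, P) % Q
        s = pow(k, -1, Q) * (m + r * x) % Q
        # The text retries on s == 0; r == 0 is retried as well here,
        # because the verifier rejects any r outside 1..q-1.
        if r != 0 and s != 0:
            return r, s

def verify(m: int, r: int, s: int, y: int) -> bool:
    """Bob's three verification steps."""
    if not (1 <= r <= Q - 1 and 1 <= s <= Q - 1):
        return False
    t = pow(s, -1, Q)                                # t = s^-1 mod q
    v = pow(pow(G, m, P) * pow(y, r, P) % P, t, P) % Q
    return v == r

rng = random.Random(7)
x = rng.randrange(1, Q)        # secret key
y = pow(G, x, P)               # public key
r, s = sign(42, x, rng)
ok = verify(42, r, s, y)       # True by Proposition 3.12
```

With such a small $q$ a wrong message is accepted by coincidence with probability of roughly $1/q$, so the sketch only demonstrates correctness, not security.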
Remarks: 


1. Compared with ElGamal, the DSA has the advantage that signatures are 
fairly short, consisting of two 160-bit numbers. 

2. In DSA, most computations (in particular the exponentiations) take
place in the field $\mathbb{Z}_p$. The security of DSA depends on the difficulty of
computing discrete logarithms. So it relies on the discrete logarithm
assumption. This assumption says that it is infeasible to compute the
discrete logarithm $x$ of an element $y = g^x$ randomly chosen from $\mathbb{Z}_p^*$, where $p$
is a sufficiently large prime and $g$ is a primitive root of $\mathbb{Z}_p^*$ (see Definition
6.1 for a precise statement). Here, as in some cryptographic protocols
discussed in Chapter 4 (commitments, electronic elections and digital
cash), the base $g$ is not a primitive root (with order $p - 1$), but an
element of order $q$, where $q$ is a large prime divisor of $p - 1$. To get the secret
key $x$, it would suffice to find discrete logarithms for random elements
$y = g^x$ from the much smaller subgroup $G_q$ generated by $g$. Thus, the
security of DSA (and some protocols discussed in Chapter 4) requires the
(widely believed) stronger assumption that finding discrete logarithms
for elements randomly chosen from the subgroup $G_q$ is infeasible.

3. The remarks on the security of ElGamal’s signature scheme also apply
to DSA and DSS.

4. In the DSS, messages are first hashed before being signed by DSA. The DSS
suggests taking SHA-1 for the hash function.


3.6 Modular Squaring 


Breaking the RSA cryptosystem might be easier than solving the factoring 
problem. It is widely believed to be equivalent to factoring, but no proof 
of this assumption exists. Rabin proposed a cryptosystem whose underlying 
encryption algorithm is provably as difficult to break as the factorization of 
large numbers (see [Rabin79]). 


3.6.1 Rabin’s Encryption 


Rabin’s system is based on the modular squaring function

$$\mathrm{Square} : \mathbb{Z}_n \longrightarrow \mathbb{Z}_n,\ m \longmapsto m^2.$$

This is a one-way function with trapdoor, if the factoring of large numbers
is assumed to be infeasible (see Section 3.2.2).


Key Generation. As in the RSA scheme, we randomly choose two large
distinct primes, $p$ and $q$, for Bob. The scheme works with arbitrary primes,
but primes $p$ and $q$ such that $p, q \equiv 3 \bmod 4$ speed up the decryption
algorithm. Such primes are found as in the RSA key generation procedure,
except that we only look for primes of the form $4k + 3$, which is the condition
$p, q \equiv 3 \bmod 4$. Then $n = pq$ is used as the public key and $p$ and $q$ are used
as the secret key.


Encryption and Decryption. We suppose that the messages to be
encrypted are elements in $\mathbb{Z}_n$. The modular squaring one-way function is used
as the encryption function $E$:

$$E : \mathbb{Z}_n \longrightarrow \mathbb{Z}_n,\ m \longmapsto m^2.$$

To decrypt a ciphertext $c$, Bob has to compute the square roots of $c$ in $\mathbb{Z}_n$.
The computation of modular square roots is discussed in detail in Appendix
A.7. There it is shown, for example, that square roots modulo $n$ can be
efficiently computed if and only if the factors of $n$ can be efficiently computed.
Bob can compute the square roots of $c$ because he knows the secret key $p$ and $q$.
Using the Chinese Remainder Theorem (Theorem A.29), he decomposes
$\mathbb{Z}_n$:

$$\phi : \mathbb{Z}_n \longrightarrow \mathbb{Z}_p \times \mathbb{Z}_q,\ c \longmapsto (c \bmod p,\ c \bmod q).$$

Then he computes the square roots of $c \bmod p$ and the square roots of $c \bmod q$.
Combining the solutions, he gets the square roots modulo $n$. If $p$ divides
$c$ (or $q$ divides $c$), then the only square root modulo $p$ (or modulo $q$) is 0.
Otherwise, there are two distinct square roots modulo $p$ and modulo $q$. Since
the primes are $\equiv 3 \bmod 4$, the square roots can be computed by one modular
exponentiation (see Algorithm A.61). Bob combines the square roots modulo
$p$ and modulo $q$ by using the Chinese Remainder Theorem, and obtains four
distinct square roots of $c$ (or two roots in the rare case that $p$ or $q$ divides $c$, or
the only root 0 if $c = 0$). He has to decide which of the square roots represents
the plaintext. There are different approaches. If the message is written in some
natural language, it should be easy to choose the right one. If the messages are
unstructured, one way to solve the problem is to add additional information.
The sender, Alice, might add a header to each message consisting of the
Jacobi symbol $\left(\frac{m}{n}\right)$ and the sign bit $b$ of $m$. The sign bit $b$ is defined as 0
if $0 \le m < n/2$, and 1 otherwise. Now Bob can easily determine the correct
square root (see Proposition A.66).
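
The decryption procedure can be illustrated with a small Python sketch. The primes 7 and 11 are toy values assumed for illustration, the helper names are hypothetical, and the exponent $(p+1)/4$ is the usual one-exponentiation square root for primes $\equiv 3 \bmod 4$. Selecting the correct root from the header bits is left out.

```python
def sqrt_mod_prime(c: int, p: int) -> int:
    """One square root of a square c modulo a prime p with p % 4 == 3."""
    return pow(c, (p + 1) // 4, p)

def crt(a_p: int, a_q: int, p: int, q: int) -> int:
    """Combine residues modulo p and modulo q into the residue modulo p*q."""
    return (a_p * q * pow(q, -1, p) + a_q * p * pow(p, -1, q)) % (p * q)

def rabin_encrypt(m: int, n: int) -> int:
    """E(m) = m^2 mod n."""
    return m * m % n

def rabin_roots(c: int, p: int, q: int) -> list:
    """All square roots of c modulo n = p*q (c is assumed to be a square)."""
    rp = sqrt_mod_prime(c % p, p)
    rq = sqrt_mod_prime(c % q, q)
    # Each sign choice modulo p and modulo q gives one root modulo n.
    return sorted({crt(sp, sq, p, q)
                   for sp in (rp, p - rp)
                   for sq in (rq, q - rq)})

p, q = 7, 11                     # toy primes, both congruent to 3 mod 4
n = p * q
c = rabin_encrypt(20, n)         # 400 mod 77 = 15
roots = rabin_roots(c, p, q)     # [13, 20, 57, 64]; the plaintext 20 is among them
```

The set comprehension also covers the degenerate cases from the text: if $p$ or $q$ divides $c$, duplicate roots collapse and fewer than four values are returned.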


Remark. The difficulty of computing square roots modulo $n$ is equivalent to
the difficulty of computing the factors of $n$ (see Proposition A.64). Hence,
Rabin’s encryption resists ciphertext-only attacks as long as factoring is
infeasible. The basic scheme, however, is completely insecure against a
chosen-ciphertext attack. If adversary Eve can use the decryption algorithm as a
black box, she can determine the secret key using the following attack. She
selects $m$ in the range $0 < m < n$ and computes $c = m^2 \bmod n$. Decrypting $c$
delivers $y$. There is a 50% chance that $m \ne \pm y \bmod n$, and in this case Eve
can easily compute the prime factors of $n$ from $m$ and $y$ (see Lemma A.63).
If the Jacobi symbol $\left(\frac{m}{n}\right)$ is added to each message $m$ as sketched above,
Eve may choose $m$ with $\left(\frac{m}{n}\right) = -1$, but add $+1$ to the header. Then she
will be certain to get a square root $y$ with $m \ne \pm y$. Applying some proper
formatting, as in the OAEP scheme, can prevent this attack.
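
The chosen-ciphertext attack can be sketched in a few lines of Python. The primes 103 and 107 are assumed toy values, and `decrypt_oracle` is a hypothetical stand-in for Bob's black-box decryption that always returns one fixed square root. The factor is obtained as $\gcd(m - y, n)$, which is non-trivial whenever $m \ne \pm y \bmod n$.

```python
from math import gcd
import random

P, Q = 103, 107        # Bob's secret primes (toy sizes, both 3 mod 4)
N = P * Q              # Bob's public key

def decrypt_oracle(c: int) -> int:
    """Black box using the secret primes: returns one square root of c mod N."""
    rp = pow(c % P, (P + 1) // 4, P)
    rq = pow(c % Q, (Q + 1) // 4, Q)
    return (rp * Q * pow(Q, -1, P) + rq * P * pow(P, -1, Q)) % N

def recover_factor(n: int, oracle, rng: random.Random) -> int:
    """Eve's attack: query the oracle on m^2 until gcd(m - y, n) splits n."""
    while True:
        m = rng.randrange(1, n)
        if gcd(m, n) != 1:
            return gcd(m, n)           # m already shares a factor with n
        y = oracle(m * m % n)
        d = gcd(m - y, n)              # equals 1 or n exactly when m = +-y
        if 1 < d < n:
            return d

factor = recover_factor(N, decrypt_oracle, random.Random(0))
```

Each query succeeds with probability about 1/2, so a handful of oracle calls is expected to suffice.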


3.6.2 Rabin’s Signature Scheme 


The decryption function in Rabin’s cryptosystem is only applicable to
quadratic residues modulo $n$. Therefore, the system is not immediately applicable
as a digital signature scheme. Before applying the decryption function, we
usually apply a hash function to the message to be signed. Here we need a
collision-resistant hash function whose values are quadratic residues modulo
$n$. Such a hash function is obtained by the following construction. Let $M$ be
the message space and let

$$h : M \times \{0,1\}^* \longrightarrow \mathbb{Z}_n,\ (m, x) \longmapsto h(m, x)$$

be a hash function. To sign a message $m$, Bob generates pseudorandom bits
$x$ using a pseudorandom bit generator and computes $h(m, x)$. Knowing the
factors of $n$, he can easily test whether $z := h(m, x) \in \mathrm{QR}_n$: $z$ is a square if
and only if $z \bmod p$ and $z \bmod q$ are squares, and $z \bmod p$ is a square in $\mathbb{Z}_p$
if and only if $z^{(p-1)/2} \equiv 1 \bmod p$ (Proposition A.52). He repeatedly chooses
pseudorandom bit strings $x$ until $h(m, x)$ is a square in $\mathbb{Z}_n$. Then he computes
a square root $y$ of $h(m, x)$ (e.g. using Proposition A.62). The signed message
is defined as

$$(m, x, y).$$

A signed message $(m, x, y)$ is verified by testing

$$h(m, x) = y^2.$$


If an adversary Eve is able to make Bob sign hash values of her choice, she 
can figure out Bob’s secret key (see above). 
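
A possible Python sketch of this signature scheme is given below. Several choices are assumptions made only for illustration: the toy primes 103 and 107, SHA-256 reduced modulo $n$ in the role of $h$, and a simple counter in place of the pseudorandom bit string $x$. The function names are hypothetical.

```python
import hashlib

P, Q = 103, 107        # Bob's secret primes (toy sizes, both 3 mod 4)
N = P * Q              # Bob's public key

def h(m: bytes, x: int) -> int:
    """Assumed hash h(m, x): SHA-256 of m and a 4-byte encoding of x, mod N."""
    digest = hashlib.sha256(m + x.to_bytes(4, "big")).digest()
    return int.from_bytes(digest, "big") % N

def is_qr(z: int, p: int) -> bool:
    """Euler's criterion: z is a non-zero square mod p iff z^((p-1)/2) = 1."""
    return pow(z, (p - 1) // 2, p) == 1

def sign(m: bytes):
    """Try x = 0, 1, 2, ... until h(m, x) is a square, then take a root."""
    x = 0
    while not (is_qr(h(m, x), P) and is_qr(h(m, x), Q)):
        x += 1
    z = h(m, x)
    rp = pow(z, (P + 1) // 4, P)       # square root modulo P
    rq = pow(z, (Q + 1) // 4, Q)       # square root modulo Q
    y = (rp * Q * pow(Q, -1, P) + rq * P * pow(P, -1, Q)) % N
    return m, x, y

def verify(m: bytes, x: int, y: int) -> bool:
    """Anyone can check h(m, x) = y^2 using only the public key N."""
    return h(m, x) == y * y % N

signed = sign(b"pay 10 to Eve")
valid = verify(*signed)
```

About one hash value in four is a quadratic residue modulo $n$, so the loop in `sign` is expected to end after a few iterations.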


Exercises 


1. Set up an RSA encryption scheme by generating a pair of public and 
secret keys. Choose a suitable plaintext and a ciphertext. Encrypt and 
decrypt them. 


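2. Let $n$ denote the product of two distinct primes $p$ and $q$, and let $e \in \mathbb{N}$.
Show that $e$ is prime to $\varphi(n)$ if the map

$$\mathbb{Z}_n^* \longrightarrow \mathbb{Z}_n^*,\ x \longmapsto x^e$$

is bijective.

3. Let $\mathrm{RSA}_e : \mathbb{Z}_n^* \longrightarrow \mathbb{Z}_n^*,\ x \longmapsto x^e$. Show that

$$|\{x \in \mathbb{Z}_n^* \mid \mathrm{RSA}_e(x) = x\}| = \gcd(e - 1, p - 1) \cdot \gcd(e - 1, q - 1).$$

Hint: Show that $|\{x \in \mathbb{Z}_p^* \mid x^k = 1\}| = \gcd(k, p - 1)$, where $p$ is a prime,
and use the Chinese Remainder Theorem (Theorem A.29).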


4. Let (n,e) be the public key of an RSA cryptosystem and d be the associ- 
ated decryption exponent. Construct an efficient probabilistic algorithm 
A which on input n,e and d computes the prime factors p and q of n 
with very high probability. 
Hint: Use the idea of the Miller-Rabin primality test, especially case 2 in 
the proof of Proposition A.78. 


5. Consider RSA encryption. Discuss, in detail, the advantage you get using, 
for encryption, a public exponent which has only two 1 digits in its binary 
encoding and using, for decryption, the Chinese Remainder Theorem. 


6. Let $p$, $p'$, $q$ and $q'$ be prime numbers, with $p' \ne q'$, $p = ap' + 1$, $q = bq' + 1$
and $n := pq$.
a. Show $|\{x \in \mathbb{Z}_p^* \mid p' \text{ does not divide } \mathrm{ord}(x)\}| = a$.
b. Assume that $p'$ and $q'$ are large (compared to $a$ and $b$). Let $x \in \mathbb{Z}_n^*$
be a randomly chosen element. Show that the probability that $x$ has
large order is $\ge 1 - (1/p' + 1/q' - 1/p'q')$. More precisely, show
$|\{x \in \mathbb{Z}_n^* \mid p'q' \text{ does not divide } \mathrm{ord}(x)\}| = a(q - 1) + b(p - 1) - ab$.


7. Consider RSA encryption $\mathrm{RSA}_e : \mathbb{Z}_n^* \longrightarrow \mathbb{Z}_n^*,\ x \longmapsto x^e$.

a. Show that $\mathrm{RSA}_e^l = \mathrm{id}_{\mathbb{Z}_n^*}$ for some $l \in \mathbb{N}$.

b. Consider the decryption-by-iterated-encryption attack (see Section
3.3.1). Let $p$, $p'$, $p''$, $q$, $q'$ and $q''$ be prime numbers, with $p' \ne q'$,
$p = ap' + 1$, $q = bq' + 1$, $p' = a'p'' + 1$, $q' = b'q'' + 1$, $n := pq$ and
$n' := p'q'$. Assume that $p'$ and $q'$ are large (compared to $a$ and $b$)
and that $p''$ and $q''$ are large (compared to $a'$ and $b'$). (This means
that the factors of $n$ satisfy the conditions 1 and 3 required for strong
primes.)
Show that the number of iterations necessary to decrypt a ciphertext
$c$ is $\ge p''q''$ (and thus very large) for all but an exponentially small
fraction of ciphertexts. By exponentially small, we mean $\le 2^{-|n|/k}$
($k$ constant).


8. Let $p$ be a large prime, such that $q := (p - 1)/2$ is also prime. Let $G_q$ be
the subgroup of order $q$ in $\mathbb{Z}_p^*$. Let $g$ and $h$ be randomly chosen generators
in $G_q$. We assume that it is infeasible to compute discrete logarithms in
$G_q$. Show that

$$f : \{0, \ldots, q - 1\}^2 \longrightarrow G_q,\ (x, y) \longmapsto g^x h^y$$

can be used to obtain a collision-resistant compression function.


9. Let $h : \{0,1\}^* \longrightarrow \{0,1\}^n$ be a cryptographic hash function. We assume
that the hash values are uniformly distributed, i.e., each value $v \in \{0,1\}^n$
has the same probability $1/2^n$. How many steps do you expect until you
succeed with the brute-force attacks against the one-way property and
second pre-image resistance?


10. Set up an ElGamal encryption scheme by generating a pair of public and
secret keys.
a. Choose a suitable plaintext and a ciphertext. Encrypt and decrypt
them.
b. Generate ElGamal signatures for suitable messages. Verify the
signatures.
c. Forge a signature without using the secret key.
d. Play the role of an adversary Eve, who learns the random number $k$
used to generate a signature, and break the system.
e. Demonstrate that checking the condition $1 \le r \le p - 1$ is necessary
in the verification of a signature $(r, s)$.


11. Weak generators (see [Bleichenbacher96]; [MenOorVan96]).
Let $p$ be a prime, $p \equiv 1 \bmod 4$, and $g \in \mathbb{Z}$ such that $g \bmod p$ is a
primitive root in $\mathbb{Z}_p^*$. Let $(p, g, x)$ be Bob’s secret key and $(p, g, y = g^x)$
be Bob’s public key in an ElGamal signature scheme. We assume that: (1)
$p - 1 = gt$ and (2) discrete logarithms can be efficiently computed in the
subgroup $H$ of order $g$ in $\mathbb{Z}_p^*$ (e.g. using the Pohlig-Hellman algorithm).
To sign a message $m$, adversary Eve does the following: (1) she sets
$r = t$; (2) she computes $z$, such that $g^{tz} = y^t$; and (3) she sets
$s = \frac{1}{2}(p - 3)(m - tz) \bmod (p - 1)$.
Show:

a. That it is possible to compute $z$ in step 2.

b. That $(r, s)$ is accepted as Bob’s signature for $m$.

c. How the attack can be prevented.


12. Set up a DSA signature scheme by generating a pair of public and secret
keys. Generate and verify signatures for suitable messages. Take small
primes $p$ and $q$.


13. Set up a Rabin encryption scheme by generating a pair of public and
secret keys. Choose a suitable plaintext and a ciphertext. Encrypt and
decrypt them. Then play the role of an adversary Eve, who succeeds in
a chosen-ciphertext attack and recovers the secret key.


4. Cryptographic Protocols 


One of the major contributions of modern cryptography is the development of 
advanced protocols providing high-level cryptographic services, such as secure 
user identification, voting schemes and digital cash. Cryptographic protocols 
use cryptographic mechanisms — such as encryption algorithms and digital 
signature schemes — as basic components. 

A protocol is a multi-party algorithm which requires at least two
participating parties. Therefore, the algorithm is distributed and invoked in at
least two different places. An algorithm that is not distributed is not called
a protocol. The parties of the algorithm must communicate with each other
to complete the task. The communication is described by the messages to be
exchanged between the parties. These are referred to as the communication
interface. The protocol requires precise definitions of the interface and the
actions to be taken by each party.

A party participating in a protocol must fulfill the syntax of the
communication interface, since not following the syntax would be immediately
detected by the other parties. The party can behave honestly and follow the
behavior specified in the protocol. Or she can behave dishonestly: she fulfills
only the syntax of the communication interface and otherwise does completely
different things. These points must be taken into account when designing
cryptographic protocols.


4.1 Key Exchange and Entity Authentication 


Public- and secret-key cryptosystems assume that the participating parties 
have access to keys. In practice, one can only apply these systems if the 
problem of distributing the keys is solved. 

The security concept for keys, which we describe below, has two levels. 
The first level embraces long-lived, secret keys, called master keys. The keys 
of the second level are associated with a session, and are called session keys. 
A session key is only valid for the short time of the duration of a session. The 
master keys are usually keys of a public-key cryptosystem. 

There are two main reasons for the two-level concept. The first is that 
symmetric key encryption is more efficient than public-key encryption. Thus, 
session keys are usually keys of a symmetric cryptosystem, and these keys
must be exchanged in a secure way, by using other keys. The second, probably
more important reason is that the two-level concept provides more security.
If a session key is compromised, it affects only that session; other sessions
in the past or in the future are not affected. Given one session key, the
number of ciphertexts available for cryptanalysis is limited. Session keys are
generated when actually required and discarded after use; they need not be
stored. Thus, there is no need to protect a large number of stored keys.

A master key is used for the generation of session keys. Special care is 
taken to prevent attacks on the master key. The access to the master key is 
severely limited. It is possible to store the master key on protected hardware, 
accessible only via a secure interface. The main focus of this section is to 
describe how to establish a session key between two parties. 

Besides key exchange, we introduce entity authentication. Entity authenti- 
cation prevents impersonation. By entity authentication, Alice can convince 
her communication partner Bob that, in the current communication, her 
identity is as declared. This might be achieved, for example, if Alice signs a 
specific message. Alice proves her identity by her signature on the message. If 
an adversary Eve intercepts the message signed by Alice, she can use it later 
to authenticate herself as Alice. Such attacks are called replay attacks. A re- 
play attack can be prevented if the message to be signed by Alice varies. For 
this purpose we introduce two methods. In the first method, Alice puts Bob’s 
name and a time stamp into the message she signs. Bob accepts a message
only if it appears for the first time. The second method uses random numbers. A
random number is chosen by Bob and transmitted to Alice. Then Alice puts 
the random number into the message, signs the message and returns it to 
Bob. Bob can check that the returned random number matches the random 
number he sent and the validity of Alice’s signature. The random number 
is viewed as a challenge. Bob sends a challenge to Alice and Alice returns a 
response to Bob’s challenge. We speak of challenge-response identification. 

Some of the protocols we discuss provide both entity authentication and 
key exchange. 


4.1.1 Kerberos 


Kerberos denotes the distributed authentication service originating from 
MIT’s project Athena. Here we use the term Kerberos in a restricted way: we 
define it as the underlying protocol that provides both entity authentication 
and key establishment, by use of symmetric cryptography and a trusted third 
party. 

Our description follows [NeuTs’o94]. In that overview article,
a simplified version of the basic protocol is introduced to make the basic
principles clear. Kerberos is designed to authenticate clients who try to get
access to servers in a network. A central role is played by a trusted server
called the Kerberos authentication server.




The Kerberos authentication server T shares a secret key of a symmetric
key encryption scheme $E$ with each client A and each server B in the network,
denoted by $k_A$ and $k_B$ respectively. Now, assume that client A wants to access
server B. At the outset, A and B do not share any secrets. The execution of
the Kerberos protocol involves A, B and T, and proceeds as follows:

Client A sends a request to the authentication server T, requesting creden- 
tials for server B. T responds with these credentials. The credentials consist 
of: 


1. A ticket $t$ for the server B, encrypted with B’s secret key $k_B$.
2. A session key $k$, encrypted with A’s key $k_A$.


The ticket $t$ contains A’s identity and a copy of the session key. It is intended
for B. A will forward it to B. The ticket is encrypted with $k_B$, which is
known only to B and T. Thus, it is not possible for A to modify the ticket
without being detected. A creates an authenticator which also contains A’s
identity, encrypts it with the session key $k$ (by this encryption A proves to
B that she knows the session key embedded in the ticket) and transmits
the ticket and the authenticator to B. B trusts the ticket (it is encrypted
with $k_B$, hence it originates from T), decrypts it and gets $k$. Now B uses
the session key to decrypt the authenticator. If B succeeds, he is convinced
that A encrypted the authenticator, because only A and the trusted T can
know $k$. Thus A is authenticated to B. Optionally, the session key $k$ can also
be used to authenticate B to A. Finally, $k$ may be used to encrypt further
communication between the two parties in the current session.

Kerberos protects the ticket and the authenticator against modification by
encrypting them. Thus, the encryption algorithm $E$ is assumed to have built-in
data integrity mechanisms.


Protocol 4.1. 
Basic Kerberos authentication protocol (simplified): 


1. A chooses $r_A$ at random¹ and sends $(A, B, r_A)$ to T.

2. T generates a new session key $k$, and creates a ticket
$t := (A, k, l)$. Here $l$ defines a validity period (consisting of a
starting and an ending time) for the ticket. T sends
$(E_{k_A}(k, r_A, l, B), E_{k_B}(t))$ to A.

3. A recovers $k$, $r_A$, $l$ and B, and verifies that $r_A$ and B match those
sent in step 1. Then she creates an authenticator
$a := (A, t_A)$, where $t_A$ is a time stamp from A’s local clock, and
sends $(E_k(a), E_{k_B}(t))$ to B.

4. B recovers $t = (A, k, l)$ and $a = (A, t_A)$, and checks that:
a. The identifier A in the ticket matches the one in the authenticator.
b. The time stamp $t_A$ is fresh, i.e., within a small time interval
around B’s local time.
c. The time stamp $t_A$ is in the validity period $l$.
If all checks pass, B considers the authentication of A as successful.

If, in addition, B is to be authenticated to A, steps 5 and 6 are
executed:

5. B takes $t_A$ and sends $E_k(t_A)$ to A.

6. A recovers $t_A$ from $E_k(t_A)$ and checks whether it matches
the $t_A$ sent in step 3. If yes, she considers B as authenticated.


¹ In this chapter all random choices are with respect to the uniform distribution.
All elements have the same probability (see Appendix B.1).
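
Steps 1 to 4 of the protocol can be traced with the following Python sketch. Everything cryptographic in it is an assumption made for illustration: `seal` and `unseal` are hypothetical helpers that build a toy authenticated encryption from a SHA-256 keystream and an HMAC tag, standing in for the scheme $E$ with built-in integrity, and the clock readings are fixed numbers rather than real time. It is not how Kerberos encodes or encrypts its messages.

```python
import hashlib, hmac, json, os

def _keystream(key: bytes, nonce: bytes, n: int) -> bytes:
    out, ctr = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + nonce + ctr.to_bytes(4, "big")).digest()
        ctr += 1
    return out[:n]

def seal(key: bytes, obj) -> bytes:
    """Toy stand-in for E_key: encrypt a JSON value and append an HMAC tag."""
    pt = json.dumps(obj).encode()
    nonce = os.urandom(16)
    ct = bytes(a ^ b for a, b in zip(pt, _keystream(key, nonce, len(pt))))
    return nonce + ct + hmac.new(key, nonce + ct, hashlib.sha256).digest()

def unseal(key: bytes, blob: bytes):
    """Check the tag (data integrity), then decrypt."""
    nonce, ct, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(key, nonce + ct, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("integrity check failed")
    pt = bytes(a ^ b for a, b in zip(ct, _keystream(key, nonce, len(ct))))
    return json.loads(pt)

k_A, k_B = os.urandom(32), os.urandom(32)   # long-term keys shared with T
NOW = 1_000_000.0                           # hypothetical clock reading

# Step 1: A -> T
r_A = os.urandom(8).hex()
# Step 2: T -> A (new session key k, ticket t = (A, k, l))
k = os.urandom(32)
l = [NOW, NOW + 3600]
for_A = seal(k_A, {"k": k.hex(), "r": r_A, "l": l, "server": "B"})
sealed_ticket = seal(k_B, {"id": "A", "k": k.hex(), "l": l})
# Step 3: A checks r_A and B, then builds the authenticator a = (A, t_A)
cred = unseal(k_A, for_A)
assert cred["r"] == r_A and cred["server"] == "B"
k_at_A = bytes.fromhex(cred["k"])
sealed_auth = seal(k_at_A, {"id": "A", "t": NOW + 1})
# Step 4: B opens the ticket, then the authenticator, and runs checks a-c
t = unseal(k_B, sealed_ticket)
k_at_B = bytes.fromhex(t["k"])
a = unseal(k_at_B, sealed_auth)
b_clock = NOW + 2
authenticated = (t["id"] == a["id"]                    # check a
                 and abs(b_clock - a["t"]) < 300       # check b (window)
                 and t["l"][0] <= a["t"] <= t["l"][1])  # check c
```

The sketch reuses one key for encryption and for the tag, which is acceptable only in a toy; the point is that A cannot alter the ticket without B noticing.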


Remarks: 


1. In step 1, a random number is included in the request. It is used to
match the response in step 2 with the request. This ensures that the
Kerberos authentication server is alive and created the response. In step
3, a time stamp is included in the request to the server. This prevents
replay attacks of such requests. To avoid the need for perfect time
synchronization, a small time interval around B’s local time (called a window)
is used. The server accepts the request if its time stamp is in the current
window and appears for the first time. The use of time stamps means that the
network must provide secure and synchronized clocks. Modifications of local
time clocks by adversaries must be prevented to guarantee the security of the
protocol.


2. The validity period of a ticket allows the reuse of the ticket in that period.
Then steps 1 and 2 in the protocol can be omitted. The client can use
the ticket $t$ to repeatedly get access to the server for which the ticket was
issued. Each time, she creates a new authenticator and executes steps 3
and 4 (or steps 3-6) of the protocol.

3. Kerberos is a popular authentication service. Version 5 of Kerberos (the
current version) was specified in [RFC 1510]. Kerberos is based in part
on Needham and Schroeder’s trusted third-party authentication protocol
[NeeSch78].

4. In the non-basic version of Kerberos, the authentication server is only
used to get tickets for the ticket-granting server. These tickets are called
ticket-granting tickets. The ticket-granting server is a specialized server,
granting tickets (server tickets) for the other servers (the ticket-granting
server must have access to the servers’ secret keys, so usually the
authentication server and the ticket-granting server run on the same host).
Client A executes steps 1 and 2 of Protocol 4.1 with the authentication
server in order to obtain a ticket-granting ticket. Then A uses the
ticket-granting ticket (more precisely, the session key included in the
ticket-granting ticket) to authenticate herself to the ticket-granting server and
to get server tickets. The ticket-granting ticket can be reused during its
validity period for the intended ticket-granting server. As long as the
same ticket-granting ticket is used, the client’s secret key $k_A$ is not used
again. This reduces the risk of exposing $k_A$. We get a three-level key
scheme. The first level embraces the long-lived secret keys of the participating
clients and servers. The keys of the second level are the session keys of
the ticket-granting tickets, and the keys of the third level are the session
keys of the server tickets.

A ticket-granting ticket is verified by the ticket-granting server in the
same way as any other ticket (see above). The ticket-granting server
decrypts the ticket, extracts the session key and decrypts the authenticator
with the session key. The client uses a ticket from the ticket-granting
server as in the basic model to authenticate to a service in the network.


4.1.2 Diffie-Hellman Key Agreement 


Diffie-Hellman key agreement (also called exponential key exchange) pro- 
vided the first practical solution to the key distribution problem. It is based 
on public-key cryptography. W. Diffie and M.E. Hellman published their 
fundamental technique of key exchange together with the idea of public-key 
cryptography in the famous paper, “New Directions in Cryptography”, in 
1976 in [DifHel76]. Exponential key exchange enables two parties that have 
never communicated before to establish a mutual secret key by exchanging 
messages over a public channel. However, the scheme only resists passive 
adversaries. 

Let $p$ be a sufficiently large prime, such that it is intractable to compute
discrete logarithms in $\mathbb{Z}_p^*$. Let $g$ be a primitive root in $\mathbb{Z}_p^*$. $p$ and $g$ are publicly
known. Alice and Bob can establish a secret shared key by executing the
following protocol:


Protocol 4.2. 
Diffie-Hellman key agreement: 


1. Alice chooses $a$, $0 \le a \le p - 2$, at random, sets $c := g^a$ and sends
$c$ to Bob.

2. Bob chooses $b$, $0 \le b \le p - 2$, at random, sets $d := g^b$ and sends
$d$ to Alice.

3. Alice computes the shared key $k = d^a = (g^b)^a$.

4. Bob computes the shared key $k = c^b = (g^a)^b$.
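
The four steps translate directly into Python. The values $p = 23$ and $g = 5$ (a primitive root modulo 23) are toy parameters assumed for illustration, and `dh_exchange` is a hypothetical helper that plays both roles in one process.

```python
import secrets

def dh_exchange(p: int, g: int):
    """Run Protocol 4.2 once, playing both Alice and Bob."""
    a = secrets.randbelow(p - 1)      # Alice's secret, 0 <= a <= p-2
    b = secrets.randbelow(p - 1)      # Bob's secret,   0 <= b <= p-2
    c = pow(g, a, p)                  # step 1: sent to Bob
    d = pow(g, b, p)                  # step 2: sent to Alice
    k_alice = pow(d, a, p)            # step 3: (g^b)^a
    k_bob = pow(c, b, p)              # step 4: (g^a)^b
    return (a, b), (c, d), (k_alice, k_bob)

(a, b), (c, d), (k_alice, k_bob) = dh_exchange(23, 5)
```

Only $c$ and $d$ cross the public channel; both parties arrive at $g^{ab}$ without ever transmitting $a$ or $b$.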


Remarks: 


1. If an attacker can compute discrete logarithms in $\mathbb{Z}_p^*$, he can compute $a$
from $c$ and then $k = d^a$. However, to get the secret key $k$, it would suffice
to compute $g^{ab}$ from $g^a$ and $g^b$. This problem is called the Diffie-Hellman
problem. The security of the protocol relies on the assumption that no
efficient algorithms exist to solve this problem. This assumption is called
the Diffie-Hellman assumption. It implies the discrete logarithm
assumption. For certain primes, the Diffie-Hellman and the discrete logarithm
assumption have been proven to be equivalent ([Boer88]; [Maurer94];
[MauWol96]; [MauWol98]; [MauWol2000]).

2. Alice and Bob can use the randomly chosen element $k = g^{ab} \in \mathbb{Z}_p^*$ as
their session key. The Diffie-Hellman assumption does not guarantee that
individual bits or groups of bits of $k$ cannot be efficiently derived from
$g^a$ and $g^b$; this would be a stronger assumption.
It is recommended to make the prime $p$ 1024 bits long. Usually, the length
of a session key in a symmetric key encryption scheme is much smaller,
say 128 bits. If we take, for example, the 128 most-significant bits of $k$
as a session key $\tilde{k}$, then $\tilde{k}$ is hard to compute from $g^a$ and $g^b$. However,
we do not know that all the individual bits of $\tilde{k}$ are secure (on the other
hand, none of the more significant bits is known to be easy to compute).
In [BonVen96] it is shown that computing the $\sqrt{|p|}$ most-significant bits
of $g^{ab}$ from $g^a$ and $g^b$ is as difficult as computing $g^{ab}$ from $g^a$ and $g^b$. For
a 1024-bit prime $p$, this result implies that the 32 most-significant bits of
$g^{ab}$ are hard to compute. The problem of finding a more secure random
session key can be solved by applying an appropriate hash function $h$ to
$g^{ab}$, and taking $k = h(g^{ab})$.

3. This protocol provides protection against passive adversaries. An active 
attacker Eve can intercept the message sent to Bob by Alice and then play 
Bob’s role. The protocol does not provide authentication of the opposite 
party. Combined with authentication techniques, the Diffie-Hellman key 
agreement can be used in practice (see Section 4.1.4). 
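
The suggestion in remark 2, to take $k = h(g^{ab})$ as the session key, might look as follows in Python. SHA-256 truncated to 128 bits is an assumed choice of $h$, and `session_key` is a hypothetical helper; the text does not prescribe a particular hash function.

```python
import hashlib

def session_key(shared: int, p: int) -> bytes:
    """Hash the group element g^ab (fixed-width big-endian) to a 128-bit key."""
    width = (p.bit_length() + 7) // 8            # bytes needed for values < p
    return hashlib.sha256(shared.to_bytes(width, "big")).digest()[:16]

p, g = 23, 5                                      # toy parameters
k_alice = session_key(pow(pow(g, 15, p), 6, p), p)   # Alice: (g^b)^a
k_bob = session_key(pow(pow(g, 6, p), 15, p), p)     # Bob:   (g^a)^b
```

Encoding the element with a fixed width ensures that both parties hash exactly the same byte string.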


4.1.3 Key Exchange and Mutual Authentication 


The problem of spontaneous key exchange, like Diffie-Hellman’s key agree- 
ment, is the authenticity of the communication partners in an open net- 
work. Entity authentication (also called entity identification) guarantees the 
identity of the communicating parties in the current communication ses- 
sion, thereby preventing impersonation. Mutual entity authentication re- 
quires some mutual secret, which has been exchanged previously, or access to 
predistributed authentic material, like the public keys of a digital signature 
scheme. 

The protocol we describe is similar to the X.509 strong three-way
authentication protocol (see [ISO/IEC 9594-8]). It provides entity authentication
and key distribution, two different cryptographic mechanisms. The term
“strong” distinguishes the protocol from simpler password-based schemes.
To set up the scheme, a public-key encryption scheme $(E, D)$ and a digital
signature scheme (Sign, Verify) are needed. Each user Alice has a key pair
$(e_A, d_A)$ for encryption and another key pair $(s_A, v_A)$ for digital signatures.
It is assumed that everyone has access to Alice’s authentic public keys, $e_A$
and $v_A$, for encryption and the verification of signatures.
Executing the following protocol, Alice and Bob establish a secret session
key. Furthermore, the protocol guarantees the mutual authenticity of the
communication parties.


Protocol 4.3. 
Strong three-way authentication: 


1. Alice chooses $r_A$ at random, sets $t_1 := (B, r_A)$ (where $B$
represents Bob’s identity), $s_1 := \mathrm{Sign}_{s_A}(t_1)$ and sends $(t_1, s_1)$ to Bob.

2. Bob verifies Alice’s signature, checks that he is the intended
recipient, chooses $r_B$ and a session key $k$ at random, encrypts
the session key with Alice’s public key, $c := E_{e_A}(k)$, sets
$t_2 := (A, r_A, r_B, c)$, signs $t_2$ to get $s_2 := \mathrm{Sign}_{s_B}(t_2)$ and sends $(t_2, s_2)$
to Alice.

3. Alice verifies Bob’s signature, checks that she is the intended
recipient and that the $r_A$ she received matches the $r_A$ from step 1
(this prevents replay attacks). If both verifications pass, she is
convinced that her communication partner is Bob. Now Alice
decrypts the session key $k$, sets $t_3 := (B, r_B)$, $s_3 := \mathrm{Sign}_{s_A}(t_3)$ and
sends $(t_3, s_3)$ to Bob.

4. Bob verifies Alice’s signature and checks that the $r_B$ he received
matches the $r_B$ from step 2 (this again prevents replay attacks).
If both verifications pass, Bob and Alice use $k$ as the session key.
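
The message flow of the protocol can be followed in a Python sketch. The instantiation is assumed purely for illustration: textbook RSA with SHA-256 (no padding) plays the role of both $(E, D)$ and (Sign, Verify), the moduli are built from small Mersenne primes, and all helper names are hypothetical. Such unpadded RSA is not secure and is used here only to make the steps executable.

```python
import hashlib, secrets

def toy_rsa_keypair(p: int, q: int, e: int = 65537):
    """Return (public, private) = ((n, e), (n, d)) for primes p and q."""
    n, phi = p * q, (p - 1) * (q - 1)
    return (n, e), (n, pow(e, -1, phi))

def H(token, n: int) -> int:
    """Hash a token (a tuple) to an integer modulo n."""
    return int.from_bytes(hashlib.sha256(repr(token).encode()).digest(), "big") % n

def sign(token, priv) -> int:
    n, d = priv
    return pow(H(token, n), d, n)

def verify(token, sig: int, pub) -> bool:
    n, e = pub
    return pow(sig, e, n) == H(token, n)

# Alice: encryption pair (e_A, d_A) and signature pair (v_A, s_A); Bob: (v_B, s_B).
eA, dA = toy_rsa_keypair(2**61 - 1, 2**89 - 1)
vA, sA = toy_rsa_keypair(2**107 - 1, 2**127 - 1)
vB, sB = toy_rsa_keypair(2**521 - 1, 2**607 - 1)

# Step 1 (Alice): t1 = (B, r_A), signed with s_A.
r_A = secrets.randbits(64)
t1 = ("B", r_A)
s1 = sign(t1, sA)

# Step 2 (Bob): check t1, choose r_B and k, encrypt k for Alice, sign t2.
assert verify(t1, s1, vA) and t1[0] == "B"
r_B = secrets.randbits(64)
k = secrets.randbits(128)
c = pow(k, eA[1], eA[0])                 # c := E_{e_A}(k)
t2 = ("A", r_A, r_B, c)
s2 = sign(t2, sB)

# Step 3 (Alice): check t2 and her own r_A, decrypt k, answer with r_B.
assert verify(t2, s2, vB) and t2[0] == "A" and t2[1] == r_A
k_alice = pow(t2[3], dA[1], dA[0])       # D_{d_A}(c)
t3 = ("B", t2[2])
s3 = sign(t3, sA)

# Step 4 (Bob): check t3 and his own r_B; both now hold the session key.
accepted = verify(t3, s3, vA) and t3[1] == r_B
```

Each side signs the other's fresh random number, which is the challenge-response idea: a recorded token from an earlier run would carry the wrong $r_A$ or $r_B$.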


Remarks: 


1. The protocol identifies the communication partner by checking that he
possesses the secret key of the signature scheme. The check is done by
the challenge-response principle. First the challenge, a random number
used only once, is submitted. If the partner can return a signature of this
random number, he necessarily possesses the secret key, thus proving his
identity. The messages exchanged (the communication tokens) are signed
by the sender and contain the recipient’s identity. This guarantees that the
token was constructed for the intended recipient by the sender. Three messages
are exchanged in the above protocol. Therefore, it is called the three-way
or three-pass authentication protocol.
There are also two-way authentication protocols. To prevent replay
attacks, the communication tokens must be stored. Since these
communication tokens have to be deleted from time to time, they are given a
time stamp and an expiration time. This requires a network with secure
and synchronized clocks. A three-way protocol requires more messages
to be exchanged, but avoids storing tokens and maintaining secure and
synchronized clocks.

2. The session key is encrypted with a public-key cryptosystem. Suppose 
adversary Eve records all the data that Alice and Bob have exchanged, 
hoping that Alice’s secret encryption key will be compromised in the
future. If Eve really gets Alice’s secret key, she can decrypt the data of all
sessions she recorded. A key-exchange scheme which resists this attack 
is said to have forward secrecy. The Diffie-Hellman key agreement does 
not encrypt a session key. Thus, the session key cannot be revealed by a 
compromised secret key. If we combine the Diffie-Hellman key agreement 
with the authentication technique of the previous protocol, as in Sec- 
tion 4.1.4, we achieve a key-exchange protocol with authentication and 
forward secrecy. 


4.1.4 Station-to-Station Protocol 


The station-to-station protocol combines Diffie-Hellman key agreement with
authentication. It goes back to earlier work on ISDN telephone security, as
outlined by W. Diffie in [Diffie88], in which the protocol is executed between
two ISDN telephones (stations). The station-to-station protocol enables two
parties to establish a shared secret key k to be used in a symmetric encryption
algorithm E. Additionally, it provides mutual authentication.

Let p be a prime such that it is intractable to compute discrete logarithms
in Z_p^*. Let g be a primitive root in Z_p^*. p and g are publicly known. Further,
we assume that a digital signature scheme (Sign, Verify) is available. Each
user, say Alice, has a key pair (s_A, v_A) for this signature scheme. s_A is the
secret key for signing and v_A is the public key for verifying Alice’s
signatures. We assume that each party has access to authentic copies of the
other’s public key.

Alice and Bob can establish a secret shared key and authenticate each 
other if they execute the following protocol: 


Protocol 4.4. 
Station-to-station protocol: 


1. Alice chooses a, 0 < a < p - 2, at random, sets c := g^a and sends
c to Bob.

2. Bob chooses b, 0 < b < p - 2, at random, computes the shared
secret key k = g^{ab}, takes his secret key s_B and signs the
concatenation of g^a and g^b to get s := Sign_{s_B}(g^a || g^b). Then he
sends (g^b, E_k(s)) to Alice.

3. Alice computes the shared key k = g^{ab}, decrypts E_k(s) and
verifies Bob’s signature. If this verification succeeds, Alice is
convinced that the opposite party is Bob. She takes her secret key
s_A, generates the signature s := Sign_{s_A}(g^b || g^a) and sends E_k(s)
to Bob.

4. Bob decrypts E_k(s) and verifies Alice’s signature. If the
verification succeeds, Bob accepts that he actually shares k with Alice.


Remarks:


1. Both Alice and Bob contribute to the random strings g^a || g^b and g^b || g^a.
Thus each string can serve as a challenge.

2. Encrypting the signatures with the key k guarantees that the party who
signed also knows the secret key k.
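
The message flow of Protocol 4.4 can be traced in a short Python sketch. Everything concrete in it is an assumption made for illustration and is not prescribed by the protocol: the toy group (p = 2^31 - 1, g = 7), textbook RSA signatures with tiny primes, a hash-derived XOR pad standing in for the symmetric algorithm E, and all function names. None of these choices is secure.

```python
import hashlib
import secrets

P, G = 2**31 - 1, 7  # toy group parameters, far too small for real use


def toy_rsa_keypair(p, q, e=17):
    """Return (signing key, verification key) for textbook RSA (toy scheme)."""
    n, phi = p * q, (p - 1) * (q - 1)
    return (n, pow(e, -1, phi)), (n, e)


def digest(data, n):
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n


def sign(sk, data):
    n, d = sk
    return pow(digest(data, n), d, n)


def verify(vk, data, sig):
    n, e = vk
    return pow(sig, e, n) == digest(data, n)


def E(k, value):
    """Stand-in for E_k: XOR with a pad derived from k (its own inverse)."""
    pad = hashlib.sha256(str(k).encode()).digest()[:8]
    return value ^ int.from_bytes(pad, "big")


def concat(u, v):
    return u.to_bytes(4, "big") + v.to_bytes(4, "big")


def station_to_station(sk_a, vk_a, sk_b, vk_b):
    """Run steps 1-4; return (Alice's key, Bob's key) or None on failure."""
    a = secrets.randbelow(P - 3) + 1              # step 1: 0 < a < p - 2
    g_a = pow(G, a, P)
    b = secrets.randbelow(P - 3) + 1              # step 2
    g_b = pow(G, b, P)
    k_bob = pow(g_a, b, P)
    enc_b = E(k_bob, sign(sk_b, concat(g_a, g_b)))
    k_alice = pow(g_b, a, P)                      # step 3
    if not verify(vk_b, concat(g_a, g_b), E(k_alice, enc_b)):
        return None
    enc_a = E(k_alice, sign(sk_a, concat(g_b, g_a)))
    if not verify(vk_a, concat(g_b, g_a), E(k_bob, enc_a)):   # step 4
        return None
    return k_alice, k_bob
```

Calling `station_to_station` with two key pairs from `toy_rsa_keypair` should return the same key twice, since both parties compute g^{ab}.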


4.1.5 Public-Key Management Techniques 


In the public-key-based key-exchange protocols discussed in the previous sec- 
tions, we assumed that each party has access to the other parties’ (authentic) 
public keys. This requirement can be met by public-key management tech- 
niques. A trusted third party C is needed, similar to the Kerberos authenti- 
cation server in the Kerberos protocol. However, in contrast to Kerberos, the 
authentication transactions do not include an online communication with C. 
C prepares information in advance, which is then available to Alice and Bob 
during the execution of the authentication protocol. We say that C is offline. 
Offline third parties reduce network traffic, which is advantageous in large 
networks. 


Certification Authority. The offline trusted party is referred to as a cer- 
tification authority. Her tasks are: 


1. To verify the authenticity of the entity to be associated with a public 
key. 

2. To bind a public key to a distinguished name and to register it. 

3. (Optionally) to generate a party’s private and public key. 


Certificates play a fundamental role. They enable the storage and forward- 
ing of public keys over insecure media, without danger of undetected manipu- 
lation. Certificates are signed by a certification authority, using a public-key 
signature scheme. Everyone knows the certification authority’s public key. 
The authenticity of this key may be provided by non-cryptographic means, 
such as couriers or personal acquisition. Another method would be to publish 
the key in all newspapers. The public key of the certification authority can 
be used to verify certificates signed by the certification authority. Certificates 
prove the binding of a public key to a distinguished name. The signature of 
the certification authority protects the certificate against undetected manip- 
ulation. We list some of the most important data stored on a certificate: 


1. A distinguished name (the real name or a pseudonym of the owner of the
certificate).

2. The owner’s public key.

3. The name of the certification authority.

4. A serial number identifying the certificate.

5. A validity period of the certificate.


Creation of Certificates. If Alice wants to get a certificate, she goes to
a certification authority C. To prove her identity, Alice shows her passport. 
Now, Alice needs public- and private-key pairs for encryption and digital
signatures. Alice can generate a key pair herself and hand over a copy of the
public key to C. This alternative might be taken if Alice uses a smart card to
store her secret key. Smart cards often include key generation functionality.
If the keys are generated inside the smart card, the private key never leaves 
it. This reduces the risk of theft. Another model is that C generates the key 
pair and transmits the secret key to Alice. The transmission requires a secret 
channel. C must of course be trustworthy, because she has the opportunity
to steal the secret key. After the key generation, C puts the public key on 
the certificate, together with all the other necessary information, and signs 
the certificate with her secret key. 


Storing Certificates. Alice can take her certificate and store it at home. 
She provides the certificate to others when needed, for example for signature 
verification. A better solution in an open system is to provide a certificate 
directory, and to store the certificates there. The certificate directory is a 
(distributed) database, usually maintained by the certification authority. It 
enables the search and retrieval of certificates. 


Usage of Certificates. If Bob wants to encrypt a message for Alice or to 
verify a signature allegedly produced by Alice, he retrieves Alice’s certificate 
from the certificate directory (or from Alice) and verifies the certification 
authority’s signature. If the verification is successful, he can be sure that he 
really receives Alice’s public key from the certificate and can use it. 

For reasons of operational efficiency, multiple certification authorities are
needed in large networks. If Alice and Bob belong to different certification
authorities, Bob must access an authentic copy of the public key of Alice’s
certification authority. This is possible if Bob’s certification authority C_B has
issued a certificate for Alice’s certification authority C_A. Then Bob retrieves
the certificate for C_A, verifies it and can then trust the public key of C_A.
Now he can retrieve and verify Alice’s certificate.

It is not necessary that each certification authority issues a certificate
for each other certification authority in the network. We may use a directed
graph as a model. The vertices correspond to the certification authorities,
and an edge from C_A to C_B corresponds to a certificate of C_A for C_B. Then,
a directed path should connect any two certification authorities. This is the
minimal requirement which guarantees that each user in the network can
verify each other user’s certificate.
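
The graph model can be made concrete with a breadth-first search for a certificate chain. This is a minimal sketch: the dictionary representation, the function name `trust_path` and the authority names in the usage note are our assumptions, and signature verification along the chain is omitted.

```python
from collections import deque


def trust_path(certs, start, target):
    """Return a chain of authorities from start to target, or None.

    certs maps an authority to the set of authorities for which it has
    issued a certificate (the outgoing edges of the directed graph).
    """
    queue, parent = deque([start]), {start: None}
    while queue:
        ca = queue.popleft()
        if ca == target:
            path = []
            while ca is not None:       # walk back to the start
                path.append(ca)
                ca = parent[ca]
            return path[::-1]
        for nxt in certs.get(ca, ()):
            if nxt not in parent:       # visit each authority once
                parent[nxt] = ca
                queue.append(nxt)
    return None
```

A user who trusts the key of `start` can verify certificates issued by `target` exactly when a chain is returned. For example, with `certs = {"C_B": {"C_X"}, "C_X": {"C_A"}}`, Bob's authority C_B reaches C_A via the hypothetical intermediate authority C_X, but there is no chain in the opposite direction.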

However, the chaining of authentications may reduce the trust in the final 
result: the more people you have to trust, the greater the risk that you have 
a cheater in the group. 


Revocation of Certificates. If Alice’s secret key is compromised, the cor- 
responding public key can no longer be used for encrypting messages. If the 
key is used in a signature scheme, Alice can no longer sign messages with 
this key. Moreover, it should be possible for Alice to deny all signatures
produced with this key from then on. Therefore, the fact that Alice’s secret key
is compromised must be publicized. Of course, the certification authority will 
remove Alice’s certificate from the certificate directory. However, certificates 
may have been retrieved before and may not yet have expired. It is not pos- 
sible to notify all users possessing copies of Alice’s certificate: they are not 
known to the certification authority. A solution to this problem is to main- 
tain certificate revocation lists. A certificate revocation list is a list of entries 
corresponding to revoked certificates. To guarantee authenticity, the list is 
signed by the certification authority. 
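
As a small illustration, the certificate data listed above and the check against a revocation list can be sketched in Python. The class and field names are our assumptions, the public key is represented by a placeholder integer, and the verification of the authority's signatures on the certificate and on the list is deliberately left out.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Certificate:
    name: str           # distinguished name of the owner
    public_key: int     # owner's public key (placeholder representation)
    issuer: str         # name of the certification authority
    serial: int         # serial number identifying the certificate
    not_before: date    # start of the validity period
    not_after: date     # end of the validity period


def is_usable(cert, revoked_serials, today):
    """True if cert is within its validity period and not revoked.

    Signature checks on the certificate and on the revocation list
    are omitted in this sketch.
    """
    in_period = cert.not_before <= today <= cert.not_after
    return in_period and cert.serial not in revoked_serials
```

A certificate that has not yet expired is still rejected as soon as its serial number appears in the revocation list.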


4.2 Identification Schemes 


There are many situations where it is necessary to “prove” one’s identity. 
Typical scenarios are to log in to a computer, to get access to an account
for electronic banking or to withdraw money from an automatic teller ma- 
chine. Older methods use passwords or PINs to implement user identification. 
Though successfully used in certain environments, these methods also have 
weaknesses. For example, anyone to whom you must give your password to 
be verified has the ability to use that password and impersonate you. Zero- 
knowledge (and other) identification schemes provide a new type of user 
identification. It is possible for you to authenticate yourself without giving 
to the authenticator the ability to impersonate you. We will see that very 
efficient implementations of such schemes exist. 


4.2.1 Interactive Proof Systems 


There are two participants in an interactive proof system, the prover and the 
verifier. It is common to call the prover Peggy and the verifier Vic. Peggy 
knows some fact (e.g. a secret key sk of a public-key cryptosystem or a 
square root of a quadratic residue x), which we call the prover’s secret. In an 
interactive proof of knowledge, Peggy wishes to convince Vic that she knows 
the prover’s secret. Peggy and Vic communicate with each other through a 
communication channel. Peggy and Vic alternately perform moves consisting 
of: 


1. Receive a message from the opposite party. 
2. Perform some computation. 
3. Send a message to the opposite party. 


Usually, Peggy starts and Vic finishes the protocol. In the first move, Peggy 
does not receive a message. The interactive proof may consist of several 
rounds. This means that the protocol specifies a sequence of moves, and 
this sequence is repeated a specified number of times. Typically, a move con- 
sists of a challenge by Vic and a response by Peggy. Vic accepts or rejects 
Peggy’s proof, depending on whether Peggy successfully answers all of Vic’s
challenges. 

Proofs in interactive proof systems are quite different from proofs in math- 
ematics. In mathematics, the prover of some theorem can sit down and prove 
the statement by himself. In interactive proof systems, there are two com- 
putational tasks, namely producing a proof (Peggy’s task) and verifying its 
validity (Vic’s task). Additionally, communication between the prover and 
verifier is necessary. 

We have the following requirements for interactive proof systems. 


1. (Knowledge) completeness. If Peggy knows the prover’s secret, then Vic 
will always accept Peggy’s proof. 

2. (Knowledge) soundness. If Peggy can convince Vic with reasonable prob- 
ability, then she knows the prover’s secret. 


If the prover and the verifier of an interactive proof system follow the 
behavior specified in the protocol, they are called an honest verifier and an 
honest prover. A prover who does not know the prover’s secret and tries to 
convince the verifier is called a cheating or dishonest prover. A verifier who 
does not follow the behavior specified in the protocol is called a cheating 
or dishonest verifier. Sometimes, the verifier can get additional information 
from the prover if he does not follow the protocol. Note that each prover (or 
verifier), whether she is honest or not, fulfills the syntax of the communication 
interface, because not following the syntax is immediately detected. She may 
only be dishonest in her private computations and the resulting data that 
she transmits. 


Password Scheme. In a simple password scheme, Peggy uses a secret pass- 
word to prove her identity. The password is the only message, and it is sent 
from the prover Peggy to the verifier Vic. Vic accepts Peggy’s identity if 
the transmitted password and the stored password are equal. Here, only one 
message is transmitted, and obviously the scheme meets the requirements. If 
Peggy knows the password, Vic accepts. If a cheating prover Eve does not 
know the password, Vic does not accept. The problem is that everyone who 
observed the password during communication can use the password. 


Identification Based on Public-Key Encryption. In Section 4.1, we 
considered an identification scheme based on a public-key cryptosystem. We 
recall the basic scenario. Each user Peggy has a secret key sk only known 
to her and a public key pk known to everyone. Suppose that everyone who 
can decrypt a randomly chosen encrypted message must know the secret key. 
This assumption should be true if the cryptosystem is secure. Hence, the 
secret key sk can be used to identify Peggy. 
Peggy proves her identity to Vic using the following steps: 


1. Vic chooses a random message m, encrypts it with the public key pk and 
sends the cryptogram c to Peggy. 
2. Peggy decrypts c with her secret key sk and sends the result m' back to
Vic.
3. Vic accepts the identity of Peggy if and only if m = m'.


Two messages are exchanged: it is a two-move protocol. The complete- 
ness of the scheme is obvious. On the other hand, a cheating prover who only 
knows the public key and a ciphertext should not be able to find the plain- 
text better than guessing at random. The probability that Vic accepts if the 
prover does not know the prover’s secret is very small. Thus, the scheme is 
also sound. This reflects Vic’s security requirements. Suppose that an adver- 
sary Eve observed the exchanged messages and later wants to impersonate 
Peggy. Vic chooses m at random and computes c. The probability of obtain- 
ing the previously observed c is very small. Thus, Eve cannot take advantage 
of observing the exchanged messages. At first glance, everything seems to 
be all right. However, there is a security problem if Vic is not honest and 
does not follow the protocol in step 1. If, instead of a randomly chosen en- 
crypted message, he sends a cryptogram intended for Peggy, then he lets 
Peggy decrypt the cryptogram. He thereby manages to get the plaintext of 
a cryptogram which he could not compute by himself. This violates Peggy’s 
security requirements. 
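
The problem can be observed in a few lines of Python. The sketch uses textbook RSA with the toy key n = 61 * 53 as a stand-in for the public-key cryptosystem; the key, the function names and the plaintext in the usage note are assumptions chosen only for illustration.

```python
import secrets

N, E_PUB, D_SEC = 3233, 17, 2753   # toy RSA key, n = 61 * 53 (insecure)


def peggy_respond(c):
    """Step 2: Peggy decrypts whatever cryptogram she receives."""
    return pow(c, D_SEC, N)


def honest_round():
    """Steps 1-3 with an honest Vic; returns True if Vic accepts."""
    m = secrets.randbelow(N - 2) + 2      # random message
    c = pow(m, E_PUB, N)                  # encrypted with Peggy's public key
    return peggy_respond(c) == m


def dishonest_vic(intercepted_c):
    """Vic submits a cryptogram intended for Peggy instead of a challenge."""
    return peggy_respond(intercepted_c)   # Peggy reveals the plaintext
```

If some sender encrypted the plaintext 1234 for Peggy, a dishonest Vic who intercepted the cryptogram `pow(1234, E_PUB, N)` obtains 1234 from `dishonest_vic`, although he does not know the secret key.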


4.2.2 Simplified Fiat-Shamir Identification Scheme 


Let n := pq, where p and q are distinct primes. As usual, QR_n denotes the
subgroup of squares in Z_n^* (see Definition A.48). Let x ∈ QR_n, and let y be a
square root of x. The modulus n and the square x are made public, while the
prime factors p, q and y are kept secret. The square root y of x is the secret
of prover Peggy. Here we assume that it is intractable to compute a square
root of x, without knowing the prime factors p and q. This is guaranteed by
the factoring assumption (see Definition 6.9) if p and q are sufficiently large
randomly chosen primes, and x is also randomly chosen. Note that the ability
to compute square roots is equivalent to the ability to factorize n (Proposition
A.64). We assume that Peggy chooses n and y, computes x = y^2 and publishes
the public key (n, x) to all participants. y is Peggy’s secret. Then Peggy may
prove her identity by an interactive proof of knowledge by proving that she
knows a square root y of x. The computations are done in Z_n^*.


Protocol 4.5. 
Fiat-Shamir identification (simplified): 


1. Peggy chooses r ∈ Z_n^* at random and sets a := r^2. Peggy sends
a to Vic.

2. Vic chooses e ∈ {0,1} at random. Vic sends e to Peggy.

3. Peggy computes b := r y^e and sends b to Vic, i.e., Peggy sends r
if e = 0, and ry if e = 1.

4. Vic accepts if and only if b^2 = a x^e.


In the protocol, three messages are exchanged — it is a three-move protocol: 


1. The first message is a commitment by Peggy that she knows a square 
root of a. 

2. The second message is a challenge by Vic. If Vic sends e = 0, then Peggy 
has to open the commitment and reveal r. If e = 1, she has to show her 
secret in encrypted form (by revealing ry). 

3. The third message is Peggy’s response to the challenge of Vic. 


Completeness. If Peggy knows y, and both Peggy and Vic follow the protocol,
then the response b = r y^e is a square root of a x^e, and Vic will accept.
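
One round of Protocol 4.5 can be sketched in Python as follows. The function names are ours, and the small primes in the usage note are an assumption that only serves to make the arithmetic visible; they provide no security.

```python
import math
import secrets


def random_unit(n):
    """Return a uniformly chosen element of Z_n^*."""
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            return r


def keygen(p, q):
    """Peggy's key generation: public key (n, x), secret y with x = y^2."""
    n = p * q
    y = random_unit(n)
    return (n, pow(y, 2, n)), y


def commit(n):
    """Step 1: choose r and return (r, a) with a = r^2."""
    r = random_unit(n)
    return r, pow(r, 2, n)


def respond(n, r, y, e):
    """Step 3: b = r * y^e."""
    return r * pow(y, e, n) % n


def check(n, x, a, e, b):
    """Step 4: accept if and only if b^2 = a * x^e."""
    return pow(b, 2, n) == a * pow(x, e, n) % n


def run_round(n, x, y):
    """Execute one round between an honest Peggy and an honest Vic."""
    r, a = commit(n)
    e = secrets.randbelow(2)          # step 2: Vic's challenge
    return check(n, x, a, e, respond(n, r, y, e))
```

With `(n, x), y = keygen(10007, 10009)`, every call of `run_round(n, x, y)` returns `True`, which is the completeness property.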


Soundness. A cheating prover Eve can convince Vic with a probability of
1/2 if she behaves as follows:


1. Eve chooses r ∈ Z_n^* and ẽ ∈ {0,1} at random, and sets a := r^2 x^{-ẽ}.
Eve sends a to Vic.
2. Vic chooses e ∈ {0,1} at random. Vic sends e to Eve.
3. Eve sends r to Vic.


If e = ẽ, Vic accepts. The event e = ẽ occurs with a probability of 1/2.
Thus, Eve succeeds in cheating with a probability of 1/2.

On the other hand, 1/2 is the best probability of success that a cheating
prover can reach. Namely, assume that a cheating prover Eve can convince
Vic with a probability > 1/2. Then Eve knows an a for which she can correctly
answer both challenges. This means that Eve can compute b_1 and b_2, such
that

b_1^2 = a and b_2^2 = a x.


Hence, she can compute the square root y = b_2 b_1^{-1} of x. Recall that x is
a random quadratic residue. Thus Eve has an algorithm A that on input
x ∈ QR_n outputs a square root y of x. Then Eve can use A to factor n (see
Proposition A.64). This contradicts our assumption that the factorization of
n is intractable.
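
Both halves of the soundness argument are short computations, which we sketch in Python. The function names are our own; `pow` with a negative exponent computes modular inverses and requires Python 3.8 or later.

```python
def eve_commit(n, x, r, guess):
    """Eve's commitment a = r^2 * x^(-guess) for her guessed challenge."""
    return pow(r, 2, n) * pow(x, -guess, n) % n


def eve_succeeds(n, x, r, guess, e):
    """Vic's check b^2 = a * x^e when Eve answers with b = r."""
    a = eve_commit(n, x, r, guess)
    return pow(r, 2, n) == a * pow(x, e, n) % n


def extract_root(n, b1, b2):
    """Square root of x from answers to e = 0 (b1) and e = 1 (b2)
    for the same commitment a: y = b2 * b1^(-1)."""
    return b2 * pow(b1, -1, n) % n
```

`eve_succeeds` returns `True` whenever `e == guess`. In the other case Eve would have to answer with a square root of x, which is exactly what `extract_root` shows: two correct answers for the same commitment reveal such a root.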


Security. We have to discuss the security of the scheme from the prover’s 
and from the verifier’s points of view. 

The verifier accepts the proof of a cheating prover with a probability
of 1/2. This probability of success of a cheating prover is too high in
practice. It might be decreased by performing t rounds, i.e., by iterating
the basic protocol t times sequentially and independently. In this way, the
probability of cheating is reduced to 2^{-t}. In Section 4.2.4 we will give a
generalized version of the protocol, which decreases the probability of success
of a cheating prover.

Now we look at the basic protocol from an honest prover’s point of view, 
and study Peggy’s security requirements. Vic chooses his challenges from 
the small set {0,1}. He has no chance of producing side effects, as in the 
identification scheme based on public-key cryptography given above. The 
only information Peggy “communicates” to Vic is the fact that she knows a
square root of x. The protocol has the zero-knowledge property studied in 
Section 4.2.3. 


4.2.3 Zero-Knowledge 


In the interactive proof system based on a public-key cryptosystem, which 
we discussed above, a dishonest verifier Vic can decrypt Peggy’s cryptograms 
by interacting with Peggy. Since Vic is not able to decrypt them without 
interaction, he learns something new by interacting with Peggy. He obtains 
knowledge from Peggy. This is not desirable, because it might violate Peggy’s 
security requirements as our example shows. It is desirable that interactive 
proof systems are designed so that no knowledge is transferred from the prover 
to the verifier. Such proof systems are called zero-knowledge. Informally, an 
interactive proof system is called zero-knowledge if whatever the verifier can 
efficiently compute after interacting with the prover, can be efficiently simu- 
lated without interaction. Below we define the zero-knowledge property more 
formally. 

We denote the algorithm that the honest prover Peggy executes by P, 
the algorithm of an honest verifier by V and the algorithm of a general 
(possibly dishonest) verifier by V*. The interactive proof system (including 
the interaction between P and V) is denoted by (P, V). Peggy knows a secret 
about some object x (e.g. as in the Fiat-Shamir example in Protocol 4.5, the 
root of a square x). This object x is the common input to P and V. 

Each algorithm is assumed to have polynomial running time. It may be 
partly controlled by random events, i.e., it has access to a source of random 
bits and thus can make random choices. Such algorithms are called proba- 
bilistic algorithms. We study this notion in detail in Chapter 5. 

Let x be the common input of (P, V). Suppose the interactive proof takes
n moves. A message is sent in each move. For simplicity, we assume that the
prover starts with the first move. We denote by m_i the message sent in the
i-th move. The messages m_1, m_3, ... are sent from the prover to the verifier
and the messages m_2, m_4, ... are sent from the verifier to the prover. The
transcript of the joint computation of P and V* on input x is defined by


tr_{P,V*}(x) = (m_1, ..., m_n),


where tr_{P,V*}(x) is called an accepting transcript if V* accepts after the last
move. Note that the transcript tr_{P,V*}(x) depends on the random bits that
the algorithms P and V* choose. Thus, it is not determined by the input x.


Definition 4.6. An interactive proof system (P,V) is (perfect) zero-knowledge
if there is a probabilistic simulator S(V*, x), running in expected polynomial
time, which for every verifier V* (honest or not) outputs on input x an
accepting transcript t of P and V*, such that these simulated transcripts are
distributed in the same way as if they were generated by the honest prover
P and V*.


Remark. The definition of zero-knowledge includes all verifiers (also the
dishonest ones). Hence, zero-knowledge is a property of the prover P. It captures
the prover’s security requirements against attempts to gain “knowledge” by
interacting with her.

To understand the definition, we have to clarify what a simulator is. A 
simulator S is an algorithm which, given some verifier V*, honest or not, gen- 
erates valid accepting transcripts for (P,V*), without communicating with 
the real prover P. In particular, S does not have any access to computations 
that rely on the prover’s secret. Trying to produce an accepting transcript, 
S plays the role of P in the protocol and communicates with V*. Thus, he 
obtains outgoing messages of V* which are compliant with the protocol. His 
task is to fill into the transcript the messages going out from P. Since P com- 
putes these messages by use of her secret and S does not know this secret, 
S applies his own strategy to generate the messages. Necessarily, his proba- 
bility of obtaining a valid transcript in this way is significantly less than 1. 
Otherwise, with high probability, S could falsely convince V* that he knows 
the secret, and the proof system is not sound. Thus, not every attempt of S 
to produce an accepting transcript is successful; he fails in many cases. Nev- 
ertheless, by repeating his attempts sufficiently often, the simulator is able 
to generate a valid accepting transcript. It is required that the expected
value of the running time which S needs to get an accepting transcript is
bounded by a polynomial in the binary length |x| of the common input x.²

To be zero-knowledge, the ability to produce accepting transcripts by a 
simulation is not sufficient. The generation of transcripts, real or simulated, 
includes random choices. Thus, we have a probability distribution on the set 
of accepting transcripts. The last condition in the definition means that the 
probability distribution of the transcripts that are generated by the simulator 
S and V* is the same as if they were generated by the honest prover P and V*. 
Otherwise, the distribution of transcripts might contain information about 
the secret and thus reveal some of P’s knowledge. 

In the following, we will illustrate the notion of zero-knowledge and the 
simulation of a prover by the simplified version of the Fiat-Shamir identifi- 
cation (Protocol 4.5). 


Proposition 4.7. The simplified version of the Fiat-Shamir identification 
scheme is zero-knowledge. 


Proof. The set of accepting transcripts is

T(x) := {(a, e, b) ∈ QR_n × {0,1} × Z_n^* | b^2 = a x^e}.


Let V* be a general (honest or cheating) verifier. Then, a simulator S with 
the desired properties is given by the following algorithm. 


² In other words, S is a Las Vegas algorithm (see Section 5.2).


Algorithm 4.8.
transcript S(algorithm V*, int x)
repeat
    select ẽ ∈ {0,1} and b̃ ∈ Z_n^* uniformly at random
    ã ← b̃^2 x^{-ẽ}
    e ← V*(ã)
until e = ẽ
return (ã, ẽ, b̃)

The simulator S uses the verifier V* as a subroutine to get the challenge
e. S tries to guess e in advance. If S succeeded in guessing e, he can produce
a valid transcript (ã, ẽ, b̃). S cannot produce e by himself, because V* is an
arbitrary verifier. Therefore, V* possibly does not generate the challenges e
randomly, as it is specified in (P,V), and S must call V* to get e. Independent
of the strategy that V* uses to output e, the guess ẽ coincides with e with
a probability of 1/2. Namely, if V* outputs 0 with a probability of p and 1
with a probability of 1 - p, the probability that e = 0 and ẽ = 0 is p/2, and
the probability that e = 1 and ẽ = 1 is (1 - p)/2. Hence, the probability that
one of both events occurs is 1/2.
The expectation is that S will produce a result after two iterations of the
repeat loop (see Lemma B.12). An element (ã, ẽ, b̃) ∈ T returned by S cannot
be distinguished from an element (a,e,b) produced by (P,V*):


1. a and ã are random quadratic residues in QR_n.
2. e and ẽ are distributed according to V*.
3. b and b̃ are random square roots.


This concludes the proof of the proposition. 
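
The simulator of Algorithm 4.8 can be mirrored in Python. The sketch assumes that the verifier V* is given as a function `verifier` mapping a commitment to a challenge bit, so that every call starts V* afresh. The function name, the returned attempt counter and the example verifier in the usage note are our additions.

```python
import math
import secrets


def simulate(n, x, verifier):
    """Produce an accepting transcript (a, e, b) without knowing a root of x.

    Returns the transcript together with the number of attempts needed.
    """
    attempts = 0
    while True:
        attempts += 1
        guess = secrets.randbelow(2)              # guess the challenge
        while True:                               # random b in Z_n^*
            b = secrets.randbelow(n - 1) + 1
            if math.gcd(b, n) == 1:
                break
        a = pow(b, 2, n) * pow(x, -guess, n) % n  # a = b^2 * x^(-guess)
        e = verifier(a)                           # ask V* for its challenge
        if e == guess:                            # guess was right
            return (a, guess, b), attempts
```

For instance, the dishonest verifier `lambda a: a & 1` derives its challenge from the commitment instead of choosing it at random. The simulator still needs two attempts on average, and the returned transcript always satisfies b^2 = a x^e.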


4.2.4 Fiat-Shamir Identification Scheme 


As in the simplified version of the Fiat-Shamir identification scheme, let n :=
pq, where p and q are distinct primes. Again, computations are performed in
Z_n, and we assume that it is intractable to compute square roots of randomly
chosen elements in QR_n, unless the factorization of n is known (see Section
4.2.2). In the simplified version of the Fiat-Shamir identification scheme, the
verifier accepts the proof of a cheating prover with a probability of 1/2. To
reduce this probability of success, now the prover’s secret is a vector y :=
(y_1, ..., y_t) of randomly chosen square roots. The modulus n and the vector
x := (y_1^2, ..., y_t^2) are made public. As above, we assume that Peggy chooses
n and y, computes x and publishes the public key (n, x) to all participants.
Peggy’s secret is y.


Protocol 4.9. 
Fiat-Shamir Identification: 


Repeat the following k times:
1. Peggy chooses r ∈ Z_n^* at random and sets a := r^2. Peggy sends
a to Vic.
2. Vic chooses e := (e_1, ..., e_t) ∈ {0,1}^t at random. Vic sends e to
Peggy.
3. Peggy computes b := r ∏_{i=1}^t y_i^{e_i}. Peggy sends b to Vic.
4. Vic rejects if b^2 ≠ a ∏_{i=1}^t x_i^{e_i} and stops.


Security. The Fiat-Shamir identification scheme extends the simplified
scheme in two aspects. First, a challenge e ∈ {0,1} in the basic scheme
is replaced by a challenge e ∈ {0,1}^t. Then the basic scheme is iterated k
times. A cheating prover Eve can convince Vic if she guesses Vic’s challenge
e correctly for each iteration; i.e., if she manages to select the right element
from {0,1}^{kt}. Her probability of accomplishing this is 2^{-kt}. It can be shown
that for t = O(log_2(|n|)) and k = O(|n|^l), the interactive proof system is still
zero-knowledge. Observe here that the expected running time of a simulator
that is constructed in a similar way as in the proof of Proposition 4.7 is no
longer polynomial if t or k are too large.


Completeness. If the legitimate prover Peggy and Vic follow the protocol, 
then Vic will accept. 
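
A Python sketch of Protocol 4.9 makes the completeness claim easy to check. The function names are ours, and the modulus in the usage note is an insecure toy value assumed only for illustration.

```python
import math
import secrets


def random_unit(n):
    """Return a uniformly chosen element of Z_n^*."""
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            return r


def selected_product(values, bits, n):
    """Product of values[i]^bits[i] modulo n."""
    out = 1
    for v, bit in zip(values, bits):
        if bit:
            out = out * v % n
    return out


def identify(n, xs, ys, k):
    """Run k iterations with honest parties; True if Vic never rejects."""
    t = len(xs)
    for _ in range(k):
        r = random_unit(n)
        a = pow(r, 2, n)                                  # step 1
        e = [secrets.randbelow(2) for _ in range(t)]      # step 2
        b = r * selected_product(ys, e, n) % n            # step 3
        if pow(b, 2, n) != a * selected_product(xs, e, n) % n:
            return False                                  # step 4
    return True
```

With `ys` a list of random units modulo n = 10007 * 10009 and `xs` their squares, `identify(n, xs, ys, k)` returns `True` for every k. A prover who uses wrong values in place of `ys` is rejected except with the small probability discussed under Security.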


Soundness. Suppose a cheating prover Eve can convince (the honest verifier)
Vic with a probability > 2^{-kt}, where the probability is taken over all
the challenges e. Then Eve knows a vector A = (a^1, ..., a^k) of commitments
a^j (one for each iteration j, 1 ≤ j ≤ k) for which she can correctly answer
two distinct challenges E = (e^1, ..., e^k) and F = (f^1, ..., f^k), E ≠ F, of
Vic. There is an iteration j, such that e^j ≠ f^j, and Eve can answer both
challenges e := e^j and f := f^j for the commitment a = a^j. This means that
Eve can compute b_1 and b_2, such that

b_1^2 = a ∏_{i=1}^t x_i^{e_i} and b_2^2 = a ∏_{i=1}^t x_i^{f_i}.


As in Section 4.2.2, this implies that Eve can compute the square root y =
b_2 b_1^{-1} of the random square x = ∏_{i=1}^t x_i^{f_i - e_i}. This contradicts our
assumption that computing square roots is intractable without knowing the
prime factors p and q of n.


Remark. The number ν of exchanged bits is k(2|n| + t), the average number
μ of multiplications is k(t + 2) and the key size s (equal to the size of the
prover’s secret) is t|n|. Choosing k and t appropriately, different values for
the three numbers can be achieved. All choices of k and t, with kt constant,
lead to the same level of security 2^{-kt}. Keeping the product kt constant and
increasing t, while decreasing k, yields smaller values for ν and μ. However,
note that the scheme is proven to be zero-knowledge only for small values of
t.
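
The trade-off is easily tabulated. The following helper evaluates the three formulas of the remark; the function name and the choice of a 1024-bit modulus with kt = 20 in the usage note are assumptions for illustration.

```python
def costs(k, t, n_bits):
    """Return (exchanged bits, average multiplications, key size in bits)
    according to the formulas k(2|n| + t), k(t + 2) and t|n|."""
    nu = k * (2 * n_bits + t)
    mu = k * (t + 2)
    s = t * n_bits
    return nu, mu, s
```

For |n| = 1024 and kt = 20, the choices (k, t) = (20, 1), (4, 5) and (1, 20) give (40980, 60, 1024), (8212, 28, 5120) and (2068, 22, 20480): communication and computation shrink as t grows, while the key becomes larger.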


4.2.5 Fiat-Shamir Signature Scheme 


[FiaSha86] gives a standard method for converting an interactive identifica- 
tion scheme into a digital signature scheme. Digital signatures are produced 
by the signer without interaction. Thus, the communication between the 
prover and the verifier has to be eliminated. The basic idea is to take the 
challenge bits, which in the identification scheme are generated by the ver- 
ifier, from the message to be signed. It must be guaranteed that the signer 
makes his commitment before he extracts the challenge bits from the mes- 
sage. This is achieved by the clever use of a publicly known, collision-resistant 
hash function (see Section 3.4): 


h: {0,1}^* → {0,1}^{kt}.


As an example, we convert the Fiat-Shamir identification scheme (Section 
4.2.4) into a signature scheme. 


Signing. To sign a message m ∈ {0,1}^*, Peggy proceeds in three steps:


1. Peggy chooses (r_1, ..., r_k) ∈ (Z_n^*)^k at random and sets a_j := r_j^2,
1 ≤ j ≤ k.
2. She computes h(m || a_1 || ... || a_k) and writes these bits into a matrix,
column by column:

e := (e_{ij}), 1 ≤ i ≤ t, 1 ≤ j ≤ k.

3. She computes

b_j := r_j ∏_{i=1}^t y_i^{e_{ij}}, 1 ≤ j ≤ k,

and sets b = (b_1, ..., b_k). The signature of m is s = (b, e).


Verification. To verify the signature s = (b, e) of the signed message (m, s),
we compute

c_j := b_j^2 ∏_{i=1}^t x_i^{-e_{ij}}, 1 ≤ j ≤ k,

and accept if e = h(m || c_1 || ... || c_k).
Here, the collision resistance of the hash function is needed. The verifier
does not get the original values a_j from step 1 of the protocol to test

a_j = b_j^2 ∏_{i=1}^t x_i^{-e_{ij}}, 1 ≤ j ≤ k.

Remark. The key size is t|n| and the signature size is k(t + |n|). The scheme 
is proven to be secure in the random oracle model, i.e., under the assumption 
that the hash function h is a truly random function (see Section 10.1 for a 
detailed discussion of what the security of a signature scheme means). 
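
Signing and verification can be sketched in Python. As an assumption of this sketch, the hash function h with kt output bits is instantiated with SHAKE-256 from the standard library, and group elements are encoded as fixed-length byte strings before hashing; the function names and the encoding are our own choices and are not prescribed by the scheme.

```python
import hashlib
import math
import secrets


def random_unit(n):
    """Return a uniformly chosen element of Z_n^*."""
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            return r


def challenge_matrix(m, values, n, k, t):
    """h(m || values) as a t-by-k bit matrix e[i][j], filled column by column."""
    size = (n.bit_length() + 7) // 8
    data = m + b"".join(v.to_bytes(size, "big") for v in values)
    out = hashlib.shake_256(data).digest((k * t + 7) // 8)
    bits = [(out[i // 8] >> (i % 8)) & 1 for i in range(k * t)]
    return [[bits[j * t + i] for j in range(k)] for i in range(t)]


def sign(m, n, ys, k):
    """Return the signature (b, e) of the byte string m."""
    t = len(ys)
    rs = [random_unit(n) for _ in range(k)]
    e = challenge_matrix(m, [pow(r, 2, n) for r in rs], n, k, t)
    bs = []
    for j, r in enumerate(rs):
        b = r
        for i in range(t):
            if e[i][j]:
                b = b * ys[i] % n          # b_j = r_j * prod y_i^e_ij
        bs.append(b)
    return bs, e


def verify(m, n, xs, sig):
    """Recompute c_j = b_j^2 * prod x_i^(-e_ij) and compare the hash."""
    bs, e = sig
    t, k = len(xs), len(bs)
    cs = []
    for j, b in enumerate(bs):
        c = pow(b, 2, n)
        for i in range(t):
            if e[i][j]:
                c = c * pow(xs[i], -1, n) % n
        cs.append(c)
    return e == challenge_matrix(m, cs, n, k, t)
```

For a correct signature the recomputed values c_j coincide with the commitments a_j, so the hash reproduces e. A changed message leads to different challenge bits, and the comparison fails.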


4.3 Commitment Schemes


Commitment schemes are of great importance in the construction of crypto- 
graphic protocols for practical applications (see Section 4.4.6), as well as for 
protocols in theoretical computer science. They are used, for example, to con- 
struct zero-knowledge proofs for all languages in NP (see [GolMicWid86]). 
This result is extended to the larger class of all languages in IP, which is the
class of languages that have interactive proofs (see [BeGrGwHåKiMiRo88]).

Commitment schemes enable a party to commit to a value while keeping 
it secret. Later, the committer provides additional information to open the 
commitment. It is guaranteed that after committing to a value, this value 
cannot be changed. No other value can be revealed in the opening step: if 
you have committed to 0, you cannot open 1 instead of 0, and vice versa. For 
simplicity (and without loss of generality), we only consider the values 0 and 
1 in our discussion. 

The following example demonstrates how to use commitment schemes. 
Suppose Alice and Bob are getting divorced. They have decided how to split 
their common possessions. Only one problem remains: who should get the 
car? They want to decide the question by a coin toss. This is difficult, be- 
cause Alice and Bob are in different places and can only talk to each other by 
telephone. They do not trust each other to report the correct outcome of a 
coin toss. This example is attributable to M. Blum. He introduced the prob- 
lem of tossing a fair coin by telephone and solved it using a bit-commitment 
protocol (see [Blum82]). 


Protocol 4.10. 
Coin tossing by telephone: 


1. Alice tosses a coin, commits to the outcome $b_A$ (heads = 0, tails
= 1) and sends the commitment to Bob.

2. Bob also tosses a coin and sends the result $b_B$ to Alice.

3. Alice opens her commitment by sending the additional information
to Bob.


Now, both parties can compute the outcome $b_A \oplus b_B$ of the joint coin toss by
telephone. If at least one of the two parties follows the protocol, i.e., sends
the result of a true coin toss, the outcome is indeed a truly random bit.
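
The three protocol steps can be sketched as follows. The commitment used here, a SHA-256 hash of a random nonce and the bit, is only a stand-in that we assume for illustration; the text does not prescribe it, and the schemes of Sections 4.3.1 and 4.3.2 could be used instead. The function names are hypothetical.

```python
import hashlib
import secrets

def commit(bit):
    """Commit to a bit; returns (commitment, opening information)."""
    nonce = secrets.token_bytes(16)
    return hashlib.sha256(nonce + bytes([bit])).digest(), nonce

def open_ok(commitment, nonce, bit):
    """Bob's check in step 3: does the opening match the commitment?"""
    return hashlib.sha256(nonce + bytes([bit])).digest() == commitment

def coin_toss(b_alice, b_bob):
    c, nonce = commit(b_alice)              # step 1: Alice sends c
    # step 2: Bob, who has seen only c, sends b_bob
    if not open_ok(c, nonce, b_alice):      # step 3: Alice opens, Bob verifies
        raise ValueError("commitment does not open to the claimed bit")
    return b_alice ^ b_bob                  # joint outcome b_A xor b_B
```

Because Alice is bound to $b_A$ before she sees $b_B$, and Bob learns nothing about $b_A$ before he sends $b_B$, neither party can bias the XOR on its own.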


In a commitment scheme, there are two participants, the committer (also 
called the sender) and the receiver. The commitment scheme defines two 
steps: 


1. Commit. The sender sends the bit b he wants to commit to, in encrypted 
form, to the receiver. 

2. Reveal or open. The sender sends additional information to the receiver, 
enabling the receiver to recover b. 

There are three requirements:


1. Hiding property. In the commit step, the receiver does not learn anything 
about the committed value. He cannot distinguish a commitment to 0 
from a commitment to 1. 

2. Binding property. The sender cannot change the committed value after 
the commit step. This requirement has to be satisfied, even if the sender 
tries to cheat. 

3. Viability. If both the sender and the receiver follow the protocol, the 
receiver will always recover the committed value. 


4.3.1 A Commitment Scheme Based on Quadratic Residues 


The commitment scheme based on quadratic residues enables Alice to commit
to a single bit. Let $\mathrm{QR}_n$ be the subgroup of squares in $\mathbb{Z}_n^*$ (see Definition
A.48), and let $J_n^{+1} := \{x \in \mathbb{Z}_n^* \mid \left(\frac{x}{n}\right) = 1\}$ be the units modulo $n$ with Jacobi
symbol 1 (see Definition A.55). Let $\mathrm{QNR}_n^{+1} := J_n^{+1} \setminus \mathrm{QR}_n$ be the non-squares
in $J_n^{+1}$.


Protocol 4.11.
QRCommitment:


1. System setup. Alice chooses distinct large prime numbers $p$ and
$q$, and $v \in \mathrm{QNR}_n^{+1}$, $n := pq$.

2. Commit. To commit to a bit $b$, Alice chooses $r \in \mathbb{Z}_n^*$ at random,
sets $c := r^2 v^b$ and sends $n$, $c$ and $v$ to Bob.

3. Reveal. Alice sends $p$, $q$, $r$ and $b$ to Bob. Bob can verify that $p$
and $q$ are primes, $n = pq$, $r \in \mathbb{Z}_n^*$, $v \notin \mathrm{QR}_n$ and $c = r^2 v^b$.


Remarks: 


1. There is an efficient deterministic algorithm which computes the Jacobi
symbol $\left(\frac{x}{n}\right)$ of $x$ modulo $n$, without knowing the prime factors $p$ and $q$ of
$n$ (Algorithm A.59). Thus, it is easy to determine whether a given $x \in \mathbb{Z}_n^*$
is in $J_n^{+1}$. However, if the factors of $n$ are kept secret, no efficient algorithm
is known that can decide whether a randomly chosen element in $J_n^{+1}$ is
a square, and it is assumed that no efficient algorithm exists for this
question of quadratic residuosity (a precise definition of this quadratic
residuosity assumption is given in Definition 6.11). On the other hand, if
$p$ and $q$ are known, it is easy to check whether $v \in J_n^{+1}$ is a square. Namely,
$v$ is a square if and only if $v \bmod p$ and $v \bmod q$ are squares, and this in
turn is true if and only if the Legendre symbols $\left(\frac{v}{p}\right) = v^{(p-1)/2} \bmod p$
and $\left(\frac{v}{q}\right) = v^{(q-1)/2} \bmod q$ are equal to 1 (see Proposition A.52).


2. If Bob could distinguish a commitment to 0 from a commitment to 1, he
could decide whether a randomly chosen element in $J_n^{+1}$ is a square. This
contradicts the quadratic residuosity assumption stated in the preceding
remark.

3. The value $c$ is a square if and only if $v^b$ is a square, i.e., if and only if
$b = 0$. Since $c$ is either a square or a non-square, Alice cannot change her
commitment after the commit step.

4. Bob needs p and q to check that v is not a square. By not revealing the 
primes p and q, Alice could use them for several commitments. Then, 
however, she has to prove that v is not a square. She could do this by an 
interactive zero-knowledge proof (see Exercise 3). 

5. Bob can use Alice’s commitment $c$ and commit to the same bit $b$ as Alice,
without knowing $b$. He chooses $\tilde{r} \in \mathbb{Z}_n^*$ at random and sets $\tilde{c} = c\tilde{r}^2$. Bob
can open his commitment after Alice has opened her commitment. If
commitments are used as subprotocols, it might cause a problem if Bob
blindly commits to the same bit as Alice. We can prevent this by asking
Bob to open his commitment before Alice does.


4.3.2 A Commitment Scheme Based on Discrete Logarithms 


The commitment scheme based on discrete logarithms enables Alice to commit
to a message $m \in \{0, \ldots, q-1\}$.


Protocol 4.12.
LogCommitment:


1. System setup. Bob randomly chooses large prime numbers $p$ and
$q$ such that $q$ divides $p - 1$. Then he randomly chooses $g$ and $v$
from the subgroup $G_q$ of order $q$ in $\mathbb{Z}_p^*$, $g, v \ne [1]$.³ Bob sends
$p$, $q$, $g$ and $v$ to Alice.

2. Commit. Alice verifies that $p$ and $q$ are primes, that $q$ divides
$p - 1$ and that $g$ and $v$ are elements of order $q$. To commit to
$m \in \{0, \ldots, q-1\}$, she chooses $r \in \{0, \ldots, q-1\}$ at random, sets
$c := g^r v^m$ and sends $c$ to Bob.

3. Reveal. Alice sends $r$ and $m$ to Bob. Bob verifies that $c = g^r v^m$.


Remarks: 


1. Bob can generate p,q,g and v as in the DSA (see Section 3.5.3). 

2. If Alice committed to $m$ and could open her commitment as $\tilde{m}$, $\tilde{m} \ne m$,
then $g^r v^m = g^{\tilde{r}} v^{\tilde{m}}$ and $\log_g(v) = (\tilde{m} - m)^{-1}(r - \tilde{r})$.⁴ Thus, Alice could
compute $\log_g(v)$ of a randomly chosen element $v \in G_q$, contradicting
the assumption that discrete logarithms of elements in $G_q$ are infeasible
to compute (see the remarks on the security of the DSA at the end of
Section 3.5.3, and Proposition 4.21).


³ There is a unique subgroup of order $q$ in $\mathbb{Z}_p^*$. It is cyclic and each element $x \in \mathbb{Z}_p^*$
of order $q$ is a generator (see Lemma A.40).

⁴ Note that we compute in $G_q \subset \mathbb{Z}_p^*$. Hence, computations like $g^r v^m$ are done
modulo $p$, and since the elements in $G_q$ have order $q$, exponents and logarithms
are computed modulo $q$ (see “Computing modulo a prime” on page 303).

3. $g$ and $v$ are generators of $G_q$. $g^r$ is a uniformly chosen random element
in $G_q$, perfectly hiding $v^m$ and $m$ in $g^r v^m$, as in the encryption with a
one-time pad (see Section 2.1).

4. Bob has no advantage if he knows $\log_g(v)$, for example by choosing $v = g^s$
with a random $s$.
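
The following sketch illustrates Protocol 4.12 and Remark 2 with toy parameters that we chose for illustration ($p = 23$, $q = 11$, $g = 2$, $v = 3$, so that $\log_g(v) = 8$). The function `equivocate` is a hypothetical helper showing what a committer could do if she knew $s = \log_g(v)$: she could open the same commitment to any other message, which is exactly why binding rests on the discrete logarithm assumption.

```python
import secrets

# Toy parameters (assumed for illustration): q divides p - 1, g and v have order q.
P, Q, G, V = 23, 11, 2, 3

def commit(m, r=None):
    """Return (c, r) with c = g^r * v^m mod p."""
    if r is None:
        r = secrets.randbelow(Q)
    return pow(G, r, P) * pow(V, m, P) % P, r

def open_ok(c, r, m):
    """Bob's check in the reveal step."""
    return c == pow(G, r, P) * pow(V, m, P) % P

def equivocate(r, m, m_new, s):
    """Given s = log_g(v), find r_new with g^r v^m = g^r_new v^m_new."""
    return (r + s * (m - m_new)) % Q
```

For example, `commit(5, 7)` yields `(8, 7)`, and with `s = 8` the same value 8 also opens as the message 2 with randomness `equivocate(7, 5, 2, 8) == 9`.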


In the commitment scheme based on quadratic residues, the hiding property
depends on the infeasibility of deciding quadratic residuosity.
The binding property is unconditional. The commitment scheme based on 
discrete logarithms has an unconditional hiding property, whereas the binding 
property depends on the difficulty of computing discrete logarithms. 

If the binding or the hiding property depends on the complexity of a com- 
putational problem, we speak of computational hiding or computational bind- 
ing. If the binding or the hiding property does not depend on the complexity 
of a computational problem, we speak of unconditional hiding or uncondi- 
tional binding. The definitions for the hiding and binding properties, given 
above, are somewhat informal. To define precisely what “cannot distinguish” 
and “cannot change” means, probabilistic algorithms have to be used, which 
we introduce in Chapter 5. 

It would be desirable to have a commitment scheme which features unconditional
hiding and binding. However, as the following considerations show,
this is impossible. Suppose a deterministic algorithm

$$C : \{0,1\}^n \times \{0,1\} \longrightarrow \{0,1\}^s$$

defines a scheme that is both unconditionally hiding and unconditionally binding. Then when
Alice sends a commitment $c = C(r, b)$ to Bob, there must exist an $\tilde{r}$ such that
$c = C(\tilde{r}, 1 - b)$. Otherwise, Bob could compute $(r, b)$ if he has unrestricted
computational power, violating the unconditional hiding property. However,
if Alice also has unrestricted computing power, then she can also find $(\tilde{r}, 1 - b)$
and open the commitment as $1 - b$, thus violating the unconditional binding
property.


4.3.3 Homomorphic Commitments 


Let $\mathrm{Com}(r, m) := g^r v^m$ denote the commitment to $m$ in the commitment
scheme based on discrete logarithms. Let $r_1, r_2, m_1, m_2 \in \{0, \ldots, q-1\}$. Then

$$\mathrm{Com}(r_1, m_1) \cdot \mathrm{Com}(r_2, m_2) = \mathrm{Com}(r_1 + r_2, m_1 + m_2).$$


Commitment schemes satisfying such a property are called homomorphic 
commitment schemes. 

Homomorphic commitment schemes have an interesting application in
distributed computation: a sum of numbers can be computed without revealing
the single numbers. This feature can be used in electronic voting
schemes (see Section 4.4). We give an example using the Com function just
introduced. Assume there are $n$ voters $V_1, \ldots, V_n$. For simplicity, we assume
that only “yes-no” votes are possible. A trusted center $T$ is needed to compute
the outcome of the election. The center $T$ is assumed to be honest. If
$T$ was dishonest, it could determine each voter’s vote. Let $E_T$ and $D_T$ be
ElGamal encryption and decryption functions for the trusted center $T$. To
vote on a subject, each voter $V_i$ chooses $m_i \in \{0,1\}$ as his vote, a random
$r_i \in \{0, \ldots, q-1\}$ and computes $c_i := \mathrm{Com}(r_i, m_i)$. Then he broadcasts $c_i$
to the public and sends $E_T(g^{r_i})$ to the trusted center $T$. $T$ computes

$$D_T\Bigl(\prod_{i=1}^{n} E_T(g^{r_i})\Bigr) = \prod_{i=1}^{n} g^{r_i} = g^r,$$

where $r = \sum_{i=1}^{n} r_i$, and broadcasts $g^r$.
Now, everyone can compute the result $s$ of the vote from the publicly
known $c_i$, $i = 1, \ldots, n$, and $g^r$:

$$v^s = g^{-r} \prod_{i=1}^{n} c_i,$$

with $s := \sum_{i=1}^{n} m_i$. $s$ can be derived from $v^s$ by computing $v, v^2, \ldots$ and
comparing with $v^s$ in each step, because the number of voters is not too
large. If the trusted center is honest, the factor $g^{r_i}$ (which hides $V_i$’s vote)
is never computed. Although an unconditional hiding commitment is used,
the hiding property is only computational because $g^{r_i}$ is encrypted with a
cryptosystem that provides at most computational security.
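
The tally computation can be sketched in Python. The toy parameters are our own choice, and the ElGamal layer is deliberately left out: the value $g^r$ that the center would obtain by decrypting the product of the $E_T(g^{r_i})$ is computed directly here, which is an assumption made only to keep the example short.

```python
import secrets

# Toy parameters (assumed for illustration); at most q - 1 = 10 voters.
P, Q, G, V = 23, 11, 2, 3

def com(r, m):
    """Com(r, m) = g^r * v^m mod p."""
    return pow(G, r, P) * pow(V, m, P) % P

def tally(commitments, g_r):
    """Recover s from v^s = g^{-r} * prod(c_i) by trying s = 0, 1, 2, ..."""
    v_s = pow(g_r, -1, P)
    for c in commitments:
        v_s = v_s * c % P
    power = 1
    for s in range(len(commitments) + 1):
        if power == v_s:
            return s
        power = power * V % P
    raise ValueError("no tally found")

def run_election(votes):
    """votes is a list of bits m_i; returns the number of yes votes."""
    rs = [secrets.randbelow(Q) for _ in votes]
    commitments = [com(r, m) for r, m in zip(rs, votes)]
    g_r = pow(G, sum(rs) % Q, P)   # stands in for D_T(prod E_T(g^{r_i}))
    return tally(commitments, g_r)
```

The search over $s$ is feasible only because the number of voters is small compared with $q$.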


4.4 Electronic Elections 


In an electronic voting scheme there are two distinct types of participants: the 
voters casting the votes and the voting authority (for short, the authority) 
that collects the votes and computes the final tally. 

Usually the following properties are required: (1) universal verifiability 
ensures that the correctness of the election, especially the correct computa- 
tion of the tally, can be checked by everyone; (2) privacy ensures that the 
secrecy of an individual vote is maintained; and (3) robustness ensures that 
the scheme works even in the presence of a coalition of parties with faulty 
behavior. Naturally, only authorized voters should be allowed to cast their 
votes. 

There seems to be a conflict between the last requirement and privacy. 
The scheme we describe resolves this conflict by establishing a group of au- 
thorities and a secret sharing scheme. It guarantees privacy, even if some of 
the authorities collaborate. 


4.4.1 Secret Sharing


The idea of secret sharing is to start with a secret s and divide it into n pieces 
called shares. These shares are distributed among n users in a secure way. 
A coalition of some of the users is able to reconstruct the secret. The secret 
could, for example, be the password to open a safe shared by five people, 
with any three of them being able to reconstruct the password and open the 
safe. 

In a $(t, n)$-threshold scheme ($t \le n$), a trusted center computes the shares
$s_i$ of a secret $s$ and distributes them among $n$ users. Any $t$ of the $n$ users
are able to recover $s$ from their shares. It is impossible to recover the secret
from $t - 1$ or fewer shares.


Shamir’s Threshold Scheme. Shamir’s threshold scheme is based on the 
following properties of polynomials over a finite field k. For simplicity, we 
take k = Zp, where p is a prime. 


Proposition 4.13. Let $f(X) = \sum_{i=0}^{t-1} a_i X^i \in \mathbb{Z}_p[X]$ be a polynomial of degree
$t - 1$, and let $P := \{(x_i, f(x_i)) \mid x_i \in \mathbb{Z}_p,\ i = 1, \ldots, t,\ x_i \ne x_j,\ i \ne j\}$. For
$Q \subset P$, let $\mathcal{P}_Q := \{g \in \mathbb{Z}_p[X] \mid \deg(g) = t - 1,\ g(x) = y \text{ for all } (x, y) \in Q\}$.


1. $\mathcal{P}_P = \{f(X)\}$, i.e., $f$ is the only polynomial of degree $t - 1$ whose graph
contains all $t$ points in $P$.

2. If $Q \subset P$ is a proper subset and $x \ne 0$ for all $(x, y) \in Q$, then each
$a \in \mathbb{Z}_p$ appears with the same frequency as the constant coefficient of a
polynomial in $\mathcal{P}_Q$.


Proof. To find all polynomials $g(X) = \sum_{i=0}^{t-1} b_i X^i \in \mathbb{Z}_p[X]$ of degree $t - 1$
through $m$ given points $(x_i, y_i)$, $1 \le i \le m$, you have to solve the following
linear equations:

$$\begin{pmatrix} 1 & x_1 & \cdots & x_1^{t-1} \\ \vdots & & & \vdots \\ 1 & x_m & \cdots & x_m^{t-1} \end{pmatrix} \begin{pmatrix} b_0 \\ \vdots \\ b_{t-1} \end{pmatrix} = \begin{pmatrix} y_1 \\ \vdots \\ y_m \end{pmatrix}.$$

If $m = t$, then the above matrix (called $A$) is a Vandermonde matrix and its
determinant

$$\det A = \prod_{1 \le i < j \le t} (x_j - x_i) \ne 0,$$

if $x_i \ne x_j$ for $i \ne j$. Hence, the system of linear equations has exactly one
solution and part 1 of Proposition 4.13 follows.

Now let $Q \subset P$ be a proper subset. Without loss of generality, $Q$ consists
of the points $(x_1, y_1), \ldots, (x_m, y_m)$, $1 \le m \le t - 1$. We consider the following
system of linear equations:

$$\begin{pmatrix} 1 & 0 & \cdots & 0 \\ 1 & x_1 & \cdots & x_1^{t-1} \\ \vdots & & & \vdots \\ 1 & x_m & \cdots & x_m^{t-1} \end{pmatrix} \begin{pmatrix} b_0 \\ b_1 \\ \vdots \\ b_{t-1} \end{pmatrix} = \begin{pmatrix} a \\ y_1 \\ \vdots \\ y_m \end{pmatrix}. \tag{4.1}$$

The matrix of the system consists of rows of a Vandermonde matrix (note
all $x_i \ne 0$ by assumption). Thus, the rows are linearly independent and
the system (4.1) has solutions for all $a \in \mathbb{Z}_p$. The matrix has rank $m + 1$
independent of $a$. Hence, the number of solutions is independent of $a$, and
we see that each $a \in \mathbb{Z}_p$ appears as the constant coefficient of a polynomial
in $\mathcal{P}_Q$ with the same frequency.


Corollary 4.14. Let $f(X) = \sum_{i=0}^{t-1} a_i X^i \in \mathbb{Z}_p[X]$ be a polynomial of degree
$t - 1$, and let $P = \{(x_i, f(x_i)) \mid i = 1, \ldots, t,\ x_i \ne x_j,\ i \ne j\}$. Then

$$f(X) = \sum_{i=1}^{t} f(x_i) \prod_{1 \le j \le t,\ j \ne i} (X - x_j)(x_i - x_j)^{-1}. \tag{4.2}$$

This formula is called the Lagrange interpolation formula.


Proof. The right-hand side of (4.2) is a polynomial $g$ of degree $t - 1$. If we
substitute $X$ by $x_i$ in $g$, we get $g(x_i) = f(x_i)$. Since the polynomial $f(X)$ is
uniquely defined by $P$, the equality holds.


Now we describe Shamir’s $(t, n)$-threshold scheme. A trusted center $T$
distributes $n$ shares of a secret $s \in \mathbb{Z}$ among $n$ users $P_1, \ldots, P_n$. To set up
the scheme, the trusted center $T$ proceeds as follows:


1. $T$ chooses a prime $p > \max(s, n)$ and sets $a_0 := s$.

2. $T$ selects $a_1, \ldots, a_{t-1} \in \{0, \ldots, p-1\}$ independently and at random, and
gets the polynomial $f(X) = \sum_{i=0}^{t-1} a_i X^i$.

3. $T$ computes $s_i := f(i)$, $i = 1, \ldots, n$ (we use the values $i = 1, \ldots, n$ for
simplicity; any $n$ pairwise distinct values $x_i \in \{1, \ldots, p-1\}$ could also
be used) and transfers $(i, s_i)$ to the user $P_i$ in a secure way.


Any group of $t$ or more users can compute the secret. Let $J \subset \{1, \ldots, n\}$,
$|J| = t$. From Corollary 4.14 we get

$$s = a_0 = f(0) = \sum_{i \in J} f(i) \prod_{j \in J,\ j \ne i} j (j - i)^{-1} = \sum_{i \in J} s_i \prod_{j \in J,\ j \ne i} j (j - i)^{-1}.$$

If only $t - 1$ or fewer shares are available, then each $a \in \mathbb{Z}_p$ is equally likely to
be the secret (by Proposition 4.13). Thus, knowing only $t - 1$ or fewer shares
provides no advantage over knowing none of them.
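
The setup and the reconstruction formula translate almost literally into code. The following is a minimal sketch with hypothetical function names; the prime passed in is assumed to satisfy $p > \max(s, n)$.

```python
import secrets

def make_shares(s, t, n, p):
    """Return the shares (i, f(i)), i = 1..n, of the secret s."""
    coeffs = [s] + [secrets.randbelow(p) for _ in range(t - 1)]  # a_0 = s

    def f(x):
        return sum(a * pow(x, i, p) for i, a in enumerate(coeffs)) % p

    return [(i, f(i)) for i in range(1, n + 1)]

def recover(shares, p):
    """Lagrange interpolation at 0: s = sum_i s_i * prod_{j != i} j / (j - i)."""
    s = 0
    for i, s_i in shares:
        lam = 1
        for j, _ in shares:
            if j != i:
                lam = lam * j * pow((j - i) % p, -1, p) % p
        s = (s + s_i * lam) % p
    return s
```

Any $t$ of the $n$ shares recover the secret; passing more than $t$ shares also works, since all of them lie on the same polynomial.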

Remarks:


1. Suppose that each $a \in \mathbb{Z}_p$ is equally likely as the secret for someone
knowing only $t - 1$ or fewer shares, as in Shamir’s scheme. Then the $(t, n)$-threshold
scheme is called perfect: the scheme provides perfect secrecy in
the information-theoretic sense (see Section 9.1). The security does not
rely on the assumed hardness of a computational problem.

2. Shamir’s threshold scheme is easily extendable for new users. New shares 
may be computed and distributed without affecting existing shares. 

3. It is possible to implement varying levels of control. One user can hold 
one or more shares. 


4.4.2 A Multi-Authority Election Scheme 


For simplicity, we only consider election schemes for yes-no votes. The vot- 
ers want to get a majority decision on some subject. In the scheme that we 
describe, each voter selects his choice (yes or no), encrypts it with a homo- 
morphic encryption algorithm and signs the cryptogram. The signature shows 
that the vote is from an authorized voter. The votes are collected in a single 
place, the bulletin board. After all voters have posted their votes, an author- 
ity can compute the tally without decrypting the single votes. This feature 
depends on the fact that the encryption algorithm used is a homomorphism. 
It guarantees the secrecy of the votes. But if the authority was dishonest, 
she could decrypt the single votes. To reduce this risk, the authority con- 
sists of several parties and the decryption key is shared among these parties, 
by use of a Shamir $(t, n)$-threshold scheme. Then at least $t$ of the $n$ parties
must be dishonest to reveal a vote. First, we assume in our discussion that a
trusted center $T$ sets up the scheme. However, the trusted center is not really
needed. In Section 4.4.6 we show that it is possible to set up the scheme by 
a communication protocol which is executed by the parties constituting the 
authority. 

The election scheme that we describe was introduced in [CraGenSch97]. 
The time and communication complexity of the scheme is remarkably low. 
A voter simply posts a single encrypted message, together with a compact 
proof that it contains a valid vote. 


The Communication Model. The members of the voting scheme commu- 
nicate through a bulletin board. The bulletin board is best viewed as publicly 
accessible memory. Each member has a designated section of the memory to 
post messages. No member can erase any information from the bulletin board. 
The complete bulletin board can be read by all members (including passive 
observers). We assume that a public-key infrastructure for digital signatures 
is used to guarantee the origin of posted messages. 


Setting up the Scheme. To set up the scheme, we assume for now that
there is a trusted center $T$. The trusted center $T$ chooses primes $p$ and $q$,
such that $q$ is a large divisor of $p - 1$, and an element $g \in \mathbb{Z}_p^*$ of order $q$, as
in the key generation procedure of the DSA (see Section 3.5.3). $g$ generates
the subgroup $G_q$ of order $q$ of $\mathbb{Z}_p^*$.⁵

Further, we assume that $T$ chooses a secret key $s \in \{0, \ldots, q-1\}$ at
random and publishes the public key $h := g^s$ to be used for ElGamal encryption
with respect to the base $g$. A message $m \in G_q$ is encrypted as
$(c_1, c_2) = (g^{\alpha}, h^{\alpha} m)$, where $\alpha$ is a randomly chosen element in $\{0, \ldots, q-1\}$.
Using the secret key $s$, the plaintext $m$ can be recovered as $m = c_2 c_1^{-s}$ (see
Section 3.5.1). The encryption is homomorphic: if $(c_1, c_2)$ and $(c_1', c_2')$ are
encryptions of $m$ and $m'$, then

$$(c_1, c_2) \cdot (c_1', c_2') = (c_1 c_1', c_2 c_2') = (g^{\alpha} g^{\alpha'}, h^{\alpha} m\, h^{\alpha'} m') = (g^{\alpha + \alpha'}, h^{\alpha + \alpha'} m m')$$

is an encryption of $mm'$.

Let $A_1, \ldots, A_n$ be the authorities and $V_1, \ldots, V_m$ be the voters in the election
scheme. The trusted center $T$ chooses a Shamir $(t, n)$-threshold scheme.
The secret encryption key $s$ is shared among the $n$ authorities. $A_j$ keeps her
share $(j, s_j)$ secret. The values $h_j := g^{s_j}$ are published on the bulletin board
by the trusted center.


Decryption. Everyone can decrypt a cryptogram $c := (c_1, c_2) := (g^{\alpha}, h^{\alpha} m)$
with some help of the authorities, but without reconstructing the secret key
$s$. Namely, the following steps are executed:


1. Each authority $A_j$ posts $w_j := c_1^{s_j}$ to the bulletin board. Here, we assume
that the authority $A_j$ is honest and follows the protocol. In Section 4.4.3
we will see how to check that she really does (by a proof of knowledge).

2. Let $J$ be the index set of a subset of $t$ honest authorities. Then, everyone
can recover $m = c_2 c_1^{-s}$ as soon as all $A_j$, $j \in J$, have finished step 1:

$$c_1^{s} = c_1^{\sum_{j \in J} s_j \lambda_{j,J}} = \prod_{j \in J} \bigl(c_1^{s_j}\bigr)^{\lambda_{j,J}} = \prod_{j \in J} w_j^{\lambda_{j,J}},$$

where

$$\lambda_{j,J} = \prod_{l \in J \setminus \{j\}} l (l - j)^{-1}.$$
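
The decryption steps can be sketched as follows. The toy parameters are our own, and we assume that the shares $s_j$ and the coefficients $\lambda_{j,J}$ are computed modulo $q$, the order of $g$, since they appear only in exponents. The function names are hypothetical.

```python
# Toy parameters (assumed for illustration): g has order q in Z_p^*.
P, Q, G = 23, 11, 2

def lagrange_at_zero(j, J):
    """lambda_{j,J} = prod over l in J, l != j, of l * (l - j)^{-1} mod q."""
    lam = 1
    for l in J:
        if l != j:
            lam = lam * l * pow((l - j) % Q, -1, Q) % Q
    return lam

def threshold_decrypt(c1, c2, partial):
    """partial maps j to w_j = c1^{s_j}; returns m = c2 * c1^{-s} mod p."""
    J = list(partial)
    c1_s = 1
    for j, w_j in partial.items():
        c1_s = c1_s * pow(w_j, lagrange_at_zero(j, J), P) % P
    return c2 * pow(c1_s, -1, P) % P
```

The secret key $s$ itself is never assembled: only the product of the posted values $w_j$, raised to the public coefficients, is formed.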


Vote Casting. Each voter $V_i$ selects his vote $v_i \in \{-1, 1\}$, encodes $v_i$ as $g^{v_i}$
and encrypts $g^{v_i}$ by the ElGamal encryption:

$$c_i = (c_{i,1}, c_{i,2}) = (g^{\alpha_i}, h^{\alpha_i} g^{v_i}).$$

He then signs it to guarantee the origin of the message and posts it to the
bulletin board. Here we assume that $V_i$ follows the protocol and correctly
forms $c_i$. He has to perform a proof of knowledge which shows that he really
does (see Section 4.4.3); otherwise his vote is invalid.


⁵ There is a unique subgroup of order $q$ of $\mathbb{Z}_p^*$. It is cyclic and each element $x \in \mathbb{Z}_p^*$
of order $q$ is a generator (see Lemma A.40 and “Computing modulo a prime” on
page 303).

Tally Computing. Assume that $m$ votes were cast:


1. Everyone can compute

$$c = (c_1, c_2) = \Bigl(\prod_{i=1}^{m} c_{i,1}, \prod_{i=1}^{m} c_{i,2}\Bigr).$$

Note that $c = (c_1, c_2)$ is the encryption of $g^d$, where $d$ is the difference
between the number of yes votes and no votes, since the encryption is
homomorphic.

2. The decryption protocol from above is executed to get $g^d$. After sufficiently
many authorities $A_j$ have posted $w_j = c_1^{s_j}$ to the bulletin board,
everyone can compute $g^d$.

3. Now $d$ can be found by computing the sequence $g^{-m}, g^{-m+1}, \ldots$ and
comparing with $g^d$ in each step.
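
A compact sketch of vote casting and tally computing, with toy parameters of our own choosing. To keep the example short we assume that the secret key $s$ is available in one piece; in the scheme itself, $c_1^s$ would be assembled from the values posted by the authorities. The function names are hypothetical.

```python
import secrets

# Toy parameters (assumed for illustration); at most 5 voters, since 2m + 1 <= q.
P, Q, G = 23, 11, 2

def encrypt_vote(v, h):
    """ElGamal encryption of g^v for a vote v in {-1, 1}."""
    alpha = secrets.randbelow(Q)
    return pow(G, alpha, P), pow(h, alpha, P) * pow(G, v % Q, P) % P

def tally(ciphertexts, s):
    """Multiply all votes, decrypt g^d, then search d in -m, ..., m."""
    c1 = c2 = 1
    for a, b in ciphertexts:
        c1, c2 = c1 * a % P, c2 * b % P
    g_d = c2 * pow(c1, (-s) % Q, P) % P
    m = len(ciphertexts)
    x = pow(G, (-m) % Q, P)            # start with g^{-m}
    for d in range(-m, m + 1):
        if x == g_d:
            return d
        x = x * G % P
    raise ValueError("no tally found")
```

Only the product of all votes is ever decrypted, so no individual vote is revealed by the tally computation.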


Remarks:


1. Everyone can check whether a voter or an authority was honest (see
Section 4.4.3), and discard invalid votes. If he finds a subset of $t$ honest
authorities, he can compute the tally. This implies universal verifiability.

2. No coalition of $t - 1$ or fewer authorities can recover the secret key. This
guarantees the robustness of the scheme.

3. Privacy depends on the security of the underlying ElGamal encryption
scheme and, hence, on the assumed difficulty of the Diffie-Hellman problem.
The scheme provides only computational privacy. A similar scheme
is introduced in [CraFraSchYun96] which even provides perfect privacy
(in the information-theoretic sense). This is achieved by using a commitment
scheme with information-theoretic secure hiding to encrypt the
votes.

4. The following remarks concern the communication complexity of the
scheme:

a. Each voter only has to send one message together with a compact
proof that the message contains a valid vote (see below). His activities
are independent of the number $n$ of authorities.

b. Each authority has to read $m$ messages from the bulletin board,
verify $m$ interactive proofs of knowledge and post one message to the
bulletin board.

c. To compute the tally, you have to read $t$ messages from the bulletin
board and to verify $t$ interactive proofs of knowledge.

5. It is possible to prepare an election beforehand. The voter $V_i$ chooses
$v_i \in \{-1, 1\}$ at random. The voting protocol is executed with the random
$v_i$ values. Later, the voter decides the alternative to choose. He selects
$\tilde{v}_i \in \{-1, 1\}$, such that $\tilde{v}_i v_i$ is his vote, and posts $\tilde{v}_i$ to the bulletin board.
The tally is computed with $\tilde{c}_i = (c_{i,1}^{\tilde{v}_i}, c_{i,2}^{\tilde{v}_i})$, $i = 1, \ldots, m$.


4.4.3 Proofs of Knowledge


Authority’s Proof. In the decryption protocol above, each authority $A_j$
has to prove that she really posts $w_j = c_1^{s_j}$, where $s_j$ is her share of the secret
key $s$. Recall that $h_j = g^{s_j}$ is published on the bulletin board. The authority
has to prove that $w_j$ and $h_j$ have the same logarithm with respect to the
bases $c_1$ and $g$ and that she knows this logarithm. We simplify the notation
and describe an interactive proof of knowledge of the common logarithm $x$
of $y_1 = g_1^x$ and $y_2 = g_2^x$, where $x$ is a random element from $\{0, \ldots, q-1\}$. As
usual, we call the prover Peggy and the verifier Vic.

Of course, in our voting scheme it is desirable for practical reasons that 
the authority proves, in a non-interactive way, to be honest. However, it is 
easy to convert the interactive proof into a non-interactive one (see Section 
4.4.4). Thus, we first give the interactive version of the proof. 


Protocol 4.15.
ProofLogEq($g_1, y_1, g_2, y_2$):


1. Peggy chooses $r \in \{0, \ldots, q-1\}$ at random and sets $a :=
(a_1, a_2) = (g_1^r, g_2^r)$. Peggy sends $a$ to Vic.

2. Vic chooses $c \in \{0, \ldots, q-1\}$ uniformly at random and sends $c$
to Peggy.

3. Peggy computes $b := r - cx$ and sends $b$ to Vic.

4. Vic accepts if and only if $a_1 = g_1^b y_1^c$ and $a_2 = g_2^b y_2^c$.


The protocol is a three-move protocol. It is very similar to the protocol used 
in the simplified Fiat-Shamir identification scheme (see Section 4.2.2): 


1. The first message is a commitment by Peggy. She commits that two
numbers have the same logarithm with respect to the different bases $g_1$
and $g_2$.

2. The second message $c$ is a challenge by Vic.

3. The third message is Peggy’s response. If $c = 0$, Peggy has to open
the commitment (reveal $r$). If $c \ne 0$, Peggy has to show her secret in
encrypted form (reveal $r - cx$).


Completeness. If Peggy knows a common logarithm for $y_1$ and $y_2$, and
both Peggy and Vic follow the protocol, then $a_1 = g_1^b y_1^c$ and $a_2 = g_2^b y_2^c$, and
Vic will accept.
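
The three moves and Vic's check can be written down directly. The sketch below uses toy parameters of our own choosing and hypothetical function names; exponents are reduced modulo $q$ because $g_1$ and $g_2$ are assumed to have order $q$.

```python
# Toy parameters (assumed for illustration): g1 = 2 and g2 = 3 have order 11 mod 23.
P, Q = 23, 11

def commit_step(g1, g2, r):
    """Peggy's first message a = (g1^r, g2^r)."""
    return pow(g1, r, P), pow(g2, r, P)

def respond(r, c, x):
    """Peggy's response b = r - c*x mod q."""
    return (r - c * x) % Q

def check(g1, y1, g2, y2, a, c, b):
    """Vic accepts iff a1 = g1^b * y1^c and a2 = g2^b * y2^c."""
    return a == (pow(g1, b, P) * pow(y1, c, P) % P,
                 pow(g2, b, P) * pow(y2, c, P) % P)
```

Completeness can be seen from $g_1^{r - cx} y_1^c = g_1^{r - cx} g_1^{cx} = g_1^r$, and likewise for the second component.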


Soundness. A cheating prover Eve can convince Vic with a probability of
$1/q$ in the following way:


1. Eve chooses $r, \tilde{c} \in \{0, \ldots, q-1\}$ at random, sets $a := (g_1^r y_1^{\tilde{c}}, g_2^r y_2^{\tilde{c}})$ and
sends $a$ to Vic.

2. Vic chooses $c \in \{0, \ldots, q-1\}$ at random and sends $c$ to Eve.

3. Eve sends $r$ to Vic.

Vic accepts if and only if $c = \tilde{c}$. The event $c = \tilde{c}$ occurs with a probability of
$1/q$. Thus, Eve succeeds in cheating with a probability of $1/q$.

If Eve can convince Vic with a probability greater than $1/q$ (the probability
is taken over the challenges $c$), she has to answer at least two challenges
correctly, for a given commitment $a$.

Suppose Eve knows an $a = (a_1, a_2)$ for which she can answer two distinct
challenges $c$ and $\tilde{c}$. This means that Eve can compute $b$ and $\tilde{b}$, such that

$$a_1 = g_1^{b} y_1^{c}, \quad a_2 = g_2^{b} y_2^{c},$$

$$a_1 = g_1^{\tilde{b}} y_1^{\tilde{c}}, \quad a_2 = g_2^{\tilde{b}} y_2^{\tilde{c}}.$$

Then she can compute

$$g_1^{b - \tilde{b}} = y_1^{\tilde{c} - c} \quad \text{and} \quad g_2^{b - \tilde{b}} = y_2^{\tilde{c} - c}$$

and gets

$$(b - \tilde{b})(\tilde{c} - c)^{-1} = \log_{g_1}(y_1) \quad \text{and} \quad (b - \tilde{b})(\tilde{c} - c)^{-1} = \log_{g_2}(y_2).$$

Thus, she can compute the secret $x$. This contradicts the assumption that it
is infeasible to compute $x$ from $g^x$ (for randomly chosen $p$, $g$ and $x$). We see
that the probability of success of a cheating prover is bounded by $1/q$.
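
The extraction step in the soundness argument is a one-line computation. The sketch below (a hypothetical helper, with $q$ passed explicitly) recovers $x$ from two accepting answers to distinct challenges for the same commitment.

```python
def extract(c, b, c_tilde, b_tilde, q):
    """Return x = (b - b~) * (c~ - c)^{-1} mod q from two accepting transcripts."""
    return (b - b_tilde) * pow((c_tilde - c) % q, -1, q) % q
```

For instance, with $q = 11$, $x = 6$ and $r = 4$, the honest answers to the challenges 3 and 9 are $b = 8$ and $\tilde{b} = 5$, and `extract(3, 8, 9, 5, 11)` gives back 6.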


Honest Verifier Zero-Knowledge. The above protocol is not known to be 
zero-knowledge. However, it is zero-knowledge if the verifier is an honest one. 
An interactive proof system (P,V) is called honest-verifier zero-knowledge 
if Definition 4.6 holds for the honest verifier V, but not necessarily for an 
arbitrary Verifier V*; i.e., there is a simulator S that produces correctly 
distributed accepting transcripts for executions of the protocol with (P,V). 

To simulate an interaction with the honest verifier is quite simple. The 
key point is that the honest verifier V chooses the challenge c € {0,...,q—1} 
independently from (a1, a2), uniformly and at random, and this can also be 
done by S. 


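Algorithm 4.16.
int S(int $g_1, g_2, y_1, y_2$)
    select $\tilde{b} \in \{0, \ldots, q-1\}$ uniformly at random
    select $\tilde{c} \in \{0, \ldots, q-1\}$ uniformly at random (this is V’s task)
    $\tilde{a}_1 \leftarrow g_1^{\tilde{b}} y_1^{\tilde{c}}$, $\tilde{a}_2 \leftarrow g_2^{\tilde{b}} y_2^{\tilde{c}}$
    return $(\tilde{a}_1, \tilde{a}_2, \tilde{c}, \tilde{b})$


The transcript $(\tilde{a}_1, \tilde{a}_2, \tilde{c}, \tilde{b})$ returned by $S$ is an accepting transcript and not
distinguishable from a transcript $(a_1, a_2, c, b)$ produced by $(P, V)$:


1. $a_1, a_2$ and $\tilde{a}_1, \tilde{a}_2$ are randomly chosen elements in $G_q$.
2. $c$ and $\tilde{c}$ are randomly chosen elements in $\{0, \ldots, q-1\}$.
3. $b$ and $\tilde{b}$ are randomly chosen elements in $\{0, \ldots, q-1\}$.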
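
The simulator is short enough to run. The sketch below mirrors Algorithm 4.16 with toy parameters of our own choosing; note that it never uses the secret $x$, which is the point of the simulation. The function names are hypothetical.

```python
import secrets

# Toy parameters (assumed for illustration).
P, Q = 23, 11

def simulate(g1, y1, g2, y2):
    """Pick response and challenge first, then solve for the commitment."""
    b = secrets.randbelow(Q)
    c = secrets.randbelow(Q)           # this is the honest verifier's task
    a1 = pow(g1, b, P) * pow(y1, c, P) % P
    a2 = pow(g2, b, P) * pow(y2, c, P) % P
    return (a1, a2), c, b

def accepts(g1, y1, g2, y2, a, c, b):
    """Vic's verification condition from Protocol 4.15."""
    return a == (pow(g1, b, P) * pow(y1, c, P) % P,
                 pow(g2, b, P) * pow(y2, c, P) % P)
```

Every simulated transcript satisfies the verification condition by construction.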

Voter’s Proof. In the vote-casting protocol, each voter has to prove that he
really encrypted a vote $g^v \in \{g, g^{-1}\}$; i.e., he has to prove that
$c = (c_1, c_2) = (g^{\alpha}, h^{\alpha} m)$ and $m \in \{g, g^{-1}\}$. For this purpose, he performs
a proof of knowledge. He proves that he knows $\alpha$ for either $c_1 = g^{\alpha}$ and
$c_2 g = h^{\alpha}$, or for $c_1 = g^{\alpha}$ and $c_2 g^{-1} = h^{\alpha}$. Each of the two alternatives could
be proven as in the authority’s proof. Here, however, the prover’s task is more
difficult. The proof must not reveal which of the two alternatives is proven.
An interactive three-move proof that convinces the verifier without revealing
anything about the prover’s choice is the subject of Exercise 9.


4.4.4 Non-Interactive Proofs of Knowledge 


It is easy to convert an interactive three-move proof into a non-interactive
one using the standard method of Fiat-Shamir, which we demonstrated in
Section 4.2.5. Let $h : \{0,1\}^* \longrightarrow \mathbb{Z}_q$ be a collision-resistant hash function. We
get a non-interactive proof

$$(c, b) = \mathrm{ProofLogEq}_h(g_1, y_1, g_2, y_2)$$

for $y_1 = g_1^x$ and $y_2 = g_2^x$ in the following way. The prover Peggy chooses
$r \in \{0, \ldots, q-1\}$ at random and sets $a := (a_1, a_2) = (g_1^r, g_2^r)$. Then she
computes the challenge $c := h(g_1 \| y_1 \| g_2 \| y_2 \| a_1 \| a_2)$ and sets $b := r - cx$. The
verification condition is

$$c = h(g_1 \| y_1 \| g_2 \| y_2 \| g_1^b y_1^c \| g_2^b y_2^c).$$

The verifier need not know $a$ to compute the verification condition. If we
trust the collision resistance of $h$, we can conclude that $u = v$ from $h(u) =
h(v)$.

If we convert our proofs of knowledge into non-interactive proofs, honest- 
verifier zero-knowledge is sufficient, because here, the verifier is always honest. 
In our election protocol, each authority and each voter completes his message 
with a non-interactive proof which convinces everyone that he followed the 
protocol. 
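
A minimal sketch of the non-interactive proof follows. The hash $h$ is modeled by SHA-256 reduced modulo $q$, and the way the inputs are serialized is our own assumption; the parameters are toy-sized and the function names are hypothetical.

```python
import hashlib
import secrets

# Toy parameters (assumed for illustration).
P, Q = 23, 11

def h(*values):
    """Stand-in for the hash h: {0,1}* -> Z_q (SHA-256 reduced mod q)."""
    data = b"|".join(str(v).encode() for v in values)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q

def prove(g1, y1, g2, y2, x):
    """Return (c, b) with c = h(g1, y1, g2, y2, a1, a2) and b = r - c*x."""
    r = secrets.randbelow(Q)
    a1, a2 = pow(g1, r, P), pow(g2, r, P)
    c = h(g1, y1, g2, y2, a1, a2)
    return c, (r - c * x) % Q

def verify(g1, y1, g2, y2, proof):
    """Recompute the commitment from (c, b) and compare the hash values."""
    c, b = proof
    a1 = pow(g1, b, P) * pow(y1, c, P) % P
    a2 = pow(g2, b, P) * pow(y2, c, P) % P
    return c == h(g1, y1, g2, y2, a1, a2)
```

With a modulus as small as $q = 11$ a forged proof would be accepted with noticeable probability, so the sketch only demonstrates the mechanics, not the security.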


4.4.5 Extension to Multi-Way Elections 


We describe how to extend the scheme if a choice between several, say $l > 2$,
options should be possible.

To encode the votes $v_1, \ldots, v_l$, we independently choose $l$ generators $g_j$,
$j = 1, \ldots, l$, of $G_q$, and encode $v_j$ by $g_j$. Voter $V_i$ encrypts his vote $v_j$ as

$$c_i = (c_{i,1}, c_{i,2}) = (g^{\alpha}, h^{\alpha} g_j).$$

Each voter shows, by an interactive proof of knowledge, that he knows $\alpha$
with

$$c_1 = g^{\alpha} \quad \text{and} \quad g_j^{-1} c_2 = h^{\alpha},$$

for exactly one $j$. We refer the interested reader to [CraGenSch97] for some
hints about the proof technique used.

The problem of computing the final tally turns out to be more complicated.
The result of the tally computation is

$$c_2 c_1^{-s} = g_1^{d_1} g_2^{d_2} \cdots g_l^{d_l},$$

where $d_j$ is the number of votes for $v_j$. The exponents $d_j$ are uniquely
determined by $c_2 c_1^{-s}$ in the following sense: computing a different solution
$(\tilde{d}_1, \ldots, \tilde{d}_l)$ would contradict the discrete logarithm assumption, because the
generators were chosen independently (see Proposition 4.21 in Section 4.5.3).
Again as above, $d = (d_1, \ldots, d_l)$ can be found for small values of $l$ and $d_j$ by
searching.


4.4.6 Eliminating the Trusted Center 


The trusted center sets up an ElGamal cryptosystem, generates a secret key, 
publishes the corresponding public key and shares the secret key among n 
users using a (t,n)-threshold scheme. To eliminate the trusted center, all 
these activities must be performed by the group of users (in our case, the 
authorities). For the communication between the participants, we assume 
below that a bulletin board, mutually secret communication channels and a 
commitment scheme exist. 


Setting Up an ElGamal Cryptosystem. We need large primes $p$, $q$, such
that $q$ is a divisor of $p - 1$, and a generator $g$ of the subgroup $G_q$ of
order $q$ of $\mathbb{Z}_p^*$. They are generated jointly by the group of users (in our case
the group of authorities). This can be achieved if each party runs the same
generation algorithm (see Section 3.5.1). The random input to the generation
algorithm must be generated jointly. To do this, the users $P_1, \ldots, P_n$ execute
the following protocol:


1. Each user $P_i$ chooses a string $r_i$ of random bits of sufficient length,
computes a commitment $c_i = C(r_i)$ and posts the result to the bulletin
board.

2. After all users have posted their commitments, each user opens his
commitment.

3. They take $r = \bigoplus_{i=1}^{n} r_i$ as the random input.
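
The three steps above can be sketched as follows. The text does not fix a commitment scheme $C$, so the hash-based commitment used here (SHA-256 over a random nonce and the value) is an assumption for illustration, as are the function names.

```python
import hashlib
import secrets
from functools import reduce


def commit(value: bytes):
    """Hypothetical commitment C(value): returns (commitment, opening nonce)."""
    nonce = secrets.token_bytes(16)
    return hashlib.sha256(nonce + value).digest(), nonce


def check_opening(commitment: bytes, nonce: bytes, value: bytes) -> bool:
    """Check that (nonce, value) opens the posted commitment."""
    return hashlib.sha256(nonce + value).digest() == commitment


def joint_random(openings):
    """openings: list of (commitment, nonce, r_i). Returns the XOR of all r_i."""
    for commitment, nonce, r_i in openings:
        if not check_opening(commitment, nonce, r_i):
            raise ValueError("commitment was opened incorrectly")
    return reduce(
        lambda a, b: bytes(x ^ y for x, y in zip(a, b)),
        (r_i for _, _, r_i in openings),
    )
```

Because every user commits before anyone opens, no user can choose his $r_i$ as a function of the others' strings; the XOR is uniformly random as long as at least one $r_i$ is.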


Publish the Public Key. To generate and distribute the public key the
users $P_1, \ldots, P_n$ execute the following protocol:


1. Each user $P_i$ chooses $x_i \in \{0, \ldots, q-1\}$ at random, computes $h_i := g^{x_i}$
and a commitment $c_i := C(h_i)$ for $h_i$. Then he posts $c_i$ to the bulletin
board.

2. After all users have posted their commitments, each user opens his
commitment.

3. Everyone can compute the public key $h := \prod_{i=1}^{n} h_i$.


Share the Secret Key. The corresponding secret key

$$x := \sum_{i=1}^{n} x_i$$

must be shared. The basic idea is that each user constructs a Shamir $(t,n)$-threshold
scheme to share his part $x_i$ of the secret key. These schemes are
combined to get a scheme for sharing $x$.

Let $f_i(X) \in \mathbb{Z}_q[X]$ be the polynomial of degree $t-1$ used for sharing $x_i$,
and let

$$f(X) = \sum_{i=1}^{n} f_i(X).$$

Then $f(0) = x$ and $f$ can be used for a $(t,n)$-threshold scheme, provided
$\deg(f) = t-1$. The shares $f(j)$ can be computed from the shares $f_i(j)$:

$$f(j) = \sum_{i=1}^{n} f_i(j).$$


The group of users executes the following protocol to set up the threshold 
scheme: 


1. The group chooses a prime $p > \max(x, n)$.

2. Each user $P_i$ randomly chooses $f_{i,j} \in \{0, \ldots, q-1\}$, $j = 1, \ldots, t-1$,
and sets $f_{i,0} = x_i$ and $f_i(X) = \sum_{j=0}^{t-1} f_{i,j} X^j$. He posts $F_{i,j} := g^{f_{i,j}}$,
$j = 1, \ldots, t-1$, to the bulletin board. Note that $F_{i,0} = h_i$ is $P_i$’s piece
of the public key and hence is already known.

3. After all users have posted their encrypted coefficients, each user tests
whether $\sum_{i=1}^{n} f_i(X)$ has degree $t-1$, by checking

$$\prod_{i=1}^{n} F_{i,t-1} \neq 1.$$

In the rare case that the test fails, they return to step 2. If $f(X)$ passes
the test, the degree of $f(X)$ is $t-1$.

4. Each user $P_i$ distributes the shares $s_{i,l} = f_i(l)$, $l = 1, \ldots, n$, of his piece $x_i$
of the secret key to the other users over secure communication channels.


5. Each user $P_i$ verifies for $l = 1, \ldots, n$ that the share $s_{l,i}$ received from
$P_l$ is consistent with the previously posted committed coefficients of $P_l$’s
polynomial:

$$g^{s_{l,i}} = \prod_{j=0}^{t-1} (F_{l,j})^{i^j}.$$

If the test fails, he stops the protocol by broadcasting the message
“failure, $(l,i)$, $s_{l,i}$”.


6. Finally, $P_i$ computes his share $s_i$ of $x$:

$$s_i = \sum_{l=1}^{n} s_{l,i},$$

signs the public key $h$ and posts his signature to the bulletin board.


After all members have signed $h$, the group will use $h$ as their public key. If
all participants followed the protocol, the corresponding secret key is shared
among the group. The protocol given here is described in [Pedersen91]. It only
works if all parties are honest. [GenJarKraRab99] introduces an improved
protocol which works if a majority of the participants is honest.
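
As an illustration, the key-sharing steps can be simulated in a single process with honest users. This is a minimal sketch under stated assumptions: the toy group ($p = 2039$, $q = 1019$, $g = 4$) is far too small for real use, all polynomial arithmetic is done modulo $q$ (the order of $g$), the commitments and signatures of the surrounding protocol are omitted, and the function names are hypothetical.

```python
import random

P, Q, G = 2039, 1019, 4  # toy group (assumption): G has prime order Q in Z_P^*


def poly_eval(coeffs, point):
    """Evaluate f(X) = sum_j coeffs[j] * X^j at point, modulo Q."""
    return sum(c * pow(point, j, Q) for j, c in enumerate(coeffs)) % Q


def run_key_sharing(n, t, rng):
    """Simulate steps 2-6 for n honest users and threshold t."""
    while True:
        # Step 2: user i picks f_i (f_i(0) = x_i) and posts F_{i,j} = g^{f_{i,j}}.
        polys = [[rng.randrange(Q) for _ in range(t)] for _ in range(n)]
        posted = [[pow(G, c, P) for c in f] for f in polys]
        # Step 3: the product of the F_{i,t-1} must differ from 1.
        top = 1
        for F in posted:
            top = top * F[t - 1] % P
        if top != 1:
            break
    h = 1  # public key h = prod_i h_i with h_i = F_{i,0}
    for F in posted:
        h = h * F[0] % P
    shares = []
    for i in range(1, n + 1):
        s_i = 0
        for l in range(n):
            s_li = poly_eval(polys[l], i)  # step 4: share sent by P_l to P_i
            rhs = 1  # step 5: g^{s_li} must equal prod_j F_{l,j}^{i^j}
            for j, F_lj in enumerate(posted[l]):
                rhs = rhs * pow(F_lj, pow(i, j, Q), P) % P
            if pow(G, s_li, P) != rhs:
                raise ValueError(f"failure, ({l + 1},{i}), {s_li}")
            s_i = (s_i + s_li) % Q  # step 6: P_i's share of x
        shares.append((i, s_i))
    return h, shares


def reconstruct(shares):
    """Lagrange interpolation at 0 modulo Q from shares (i, s_i)."""
    secret = 0
    for i, s_i in shares:
        lam = 1
        for k, _ in shares:
            if k != i:
                lam = lam * k * pow(k - i, Q - 2, Q) % Q
        secret = (secret + s_i * lam) % Q
    return secret
```

In the actual protocol nobody would call `reconstruct`; it is included only to check that any $t$ shares determine a secret $x$ with $g^x = h$.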


4.5 Digital Cash 


The growth of electronic commerce on the Internet requires digital cash. Today,
credit cards are used to pay on the Internet. Transactions are online; i.e.,
all participants — the customer, the shop and the bank — are involved at the 
same time (for simplicity, we assume only one bank). This requires that the 
bank is available even during peak traffic time, which makes the scheme very 
costly. Exposing the credit card number to the vendor provides him with the 
ability to impersonate the customer in future purchases. The bank can easily 
observe who pays which amount to whom and when, so the customer cannot 
pay anonymously, as she can with ordinary money. 

A payment with ordinary money requires three different steps. First, the 
customer fetches some money from the bank, and his account is debited. 
Then he can pay anonymously in a shop. Later, the vendor can bring the 
money to the bank, and his account is credited. 

Ordinary money has an acceptable level of security and functions well for 
its intended task. Its security is based on a complicated and secret manufac- 
turing process. However, it is not as secure in the same mathematical sense 
as some of the proposed digital cash schemes. 

Digital cash schemes are modeled on ordinary money. They involve three 
interacting parties: the bank, the customer and the shop. The customer and 
the shop have accounts with the bank. A digital cash system transfers money 
in a secure way from the customer’s account to the shop’s account. In the 
following, the money is called an electronic coin, or coin for short. 

As with ordinary money, paying with digital cash requires three steps: 


1. The customer fetches the coin from the bank: customer and bank execute 
the withdrawal protocol. 

2. The customer pays the vendor: customer and vendor execute the payment
protocol.

3. The vendor deposits the coin on his account: vendor and bank execute
the deposit protocol.


In an offline system, each step occurs in a separate transaction, whereas in 
an online system, steps 2 and 3 take place in a single transaction involving 
all three parties. 

The bank, the shop and the customer have different security requirements: 


1. The bank is assured that only a previously withdrawn coin can be de- 
posited. It must be impossible to deposit a coin twice without being 
detected. 

2. The customer is assured that the shop will accept previously withdrawn 
coins and that he can pay anonymously. 

3. In an offline system, the vendor is assured that the bank will accept a 
payment he has received from the customer. 


It is not easy to enable anonymous payments, for which it must be impossible
for the bank to trace a coin, i.e., to link a coin from a withdrawal
with the corresponding coin in the deposit step. This requirement protects
an honest customer’s privacy, but it also enables misuse by criminals.

Thus, to make anonymous payment systems practical, they must imple- 
ment a mechanism for tracing a coin. It must be possible to revoke the cus- 
tomer’s anonymity under certain well-defined conditions. Such systems are 
sometimes called fair payment systems. 

In the scheme we describe, anonymity may be revoked by a trusted third 
party called the trusted center. During the withdrawal protocol, the customer 
has to provide data which enables the trusted center to trace the coin. The 
trusted center is only needed if someone asks to revoke the anonymity of 
a customer. The trusted center is not involved if an account is opened or 
a coin is withdrawn, paid or deposited. Using the secret sharing techniques 
from Section 4.4.1, it is easy to distribute the ability to revoke a customer’s 
anonymity among a group of trusted parties. 

A customer’s anonymity is achieved by using blind signatures (see Section 
4.5.1). The customer has to construct a message of a special form, and then 
he hides the content. The bank signs the hidden message, without seeing its 
content. Older protocols use the “cut and choose method” to ensure that the 
customer formed the message correctly: the customer constructs, for example, 
1000 messages and sends them to the bank. The bank selects one of the 1000 
messages to sign it. The customer has to open the remaining 999 messages. 
If all these messages are formed correctly, then, with high probability, the 
message selected and signed by the bank is also correct. In the system that 
we describe, the customer proves that he constructed the message m correctly. 
Therefore, only one message is constructed and sent to the bank. This makes 
digital cash efficient. 

Ordinary money is transferable: the shop does not have to return a coin
to the bank after receiving it. He can transfer the money to a third person.
The digital cash system we introduce does not have this feature, but there
are electronic cash systems which do implement transferable electronic coins
(e.g., [OkaOht92], [ChaPed92]).


4.5.1 Blindly Issued Proofs 


The payment system which we describe provides customer anonymity by
using a blind digital signature.⁶ Such a signature enables the signer (the
bank) to sign a message without seeing its content (the content is hidden). 
Later, when the message and the signature are revealed, the signer is not able 
to link the signature with the corresponding signing transaction. The bank’s 
signature can be verified by everyone, just like an ordinary signature. 


The Basic Signature Scheme. Let $p$ and $q$ be large primes, such that $q$
divides $p - 1$. Let $G_q$ be the subgroup of order $q$ in $\mathbb{Z}_p^*$, $g$ a generator of
$G_q$⁷ and $h : \{0,1\}^* \to \mathbb{Z}_q$ be a collision-resistant hash function. The signer’s
secret key is a randomly chosen $x \in \{0, \ldots, q-1\}$, and the public key is
$(p, q, g, y)$, where $y = g^x$.

The basic protocol in this section is Schnorr’s identification protocol. It 
is an interactive proof of knowledge. The prover Peggy proves to the verifier 
Vic that she knows $x$, the discrete logarithm of $y$.


Protocol 4.17.
ProofLog(g, y):


1. Peggy randomly chooses $r \in \{0, \ldots, q-1\}$, computes $a := g^r$
and sends it to Vic.

2. Vic chooses $c \in \{0, \ldots, q-1\}$ at random and sends it to Peggy.

3. Peggy computes $b := r - cx$ and sends it to Vic.

4. Vic accepts the proof if $a = g^b y^c$; otherwise he rejects it.
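
The four steps of the protocol map directly onto four small functions. This is a sketch with toy parameters chosen for the example ($p = 2039$, $q = 1019$, $g = 4$, an assumption; real parameters are far larger), and the function names are hypothetical.

```python
import secrets

P, Q, G = 2039, 1019, 4  # toy parameters (assumption): G has order Q modulo P


def commit():
    """Step 1: Peggy picks r and sends the commitment a = g^r."""
    r = secrets.randbelow(Q)
    return r, pow(G, r, P)


def challenge():
    """Step 2: Vic picks a random challenge c."""
    return secrets.randbelow(Q)


def respond(r, c, x):
    """Step 3: Peggy answers with b = r - c*x (mod q)."""
    return (r - c * x) % Q


def accept(a, c, b, y):
    """Step 4: Vic accepts if a = g^b * y^c."""
    return a == pow(G, b, P) * pow(y, c, P) % P
```

The check succeeds for an honest Peggy because $g^b y^c = g^{r - cx} g^{xc} = g^r = a$.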


To achieve a signature scheme, the protocol is converted into a non-interactive
proof of knowledge ProofLog$_h$ using the same method as shown
in Sections 4.4.4 and 4.2.5: the challenge $c$ is computed by means of the
collision-resistant hash function $h$.

A signature $\sigma(m)$ of a message $m$ consists of a non-interactive proof that
the signer (prover) knows the secret key $x$. The proof depends on the message
$m$, because when computing the challenge $c$, the message $m$ serves as an
additional input to $h$:


$$\sigma(m) = (c, b) = \mathrm{ProofLog}_h(m, g, y),$$


⁶ Blind signatures were introduced by D. Chaum in [Chaum82] to enable untraceable
electronic cash.

⁷ As stated before, there is a unique subgroup of order $q$ of $\mathbb{Z}_p^*$. It is cyclic and
each element $x \in \mathbb{Z}_p^*$ of order $q$ is a generator (see Lemma A.40 and “Computing
modulo a prime” on page 303).




where $c := h(m \| a)$, $a := g^r$, $r \in \{0, \ldots, q-1\}$ chosen at random, and $b :=
r - cx$. The signature is verified by checking the condition


$$c = h(m \| g^b y^c).$$


Here, the collision resistance of the hash function is needed. The verifier does
not get $a$ to test $a = g^b y^c$. If $m$ is the empty string, we omit $m$ and write


$$(c, b) = \mathrm{ProofLog}_h(g, y).$$


In this way, we attain a non-interactive proof that the prover knows $x =
\log_g(y)$.


Remarks: 


1. As in the commitment scheme in Section 4.3.2 and the election protocol in
Section 4.4, the security relies on the assumption that discrete logarithms
of elements in $G_q$ are infeasible to compute (also see the remarks on the
security of the DSA at the end of Section 3.5.3).

2. If the signer uses the same $r$ (i.e., he uses the same commitment $a = g^r$)
to sign two different messages $m_1$ and $m_2$, then the secret key $x$ can
easily be computed:⁸
Let $\sigma(m_i) := (c_i, b_i)$, $i = 1, 2$. We have $g^r = g^{b_1} y^{c_1} = g^{b_2} y^{c_2}$ and derive
$x = (b_1 - b_2)(c_2 - c_1)^{-1}$. Note that $c_1 \neq c_2$ for $m_1 \neq m_2$, since $h$ is
collision resistant.
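
The signature scheme and the key recovery of Remark 2 can be sketched as follows. The toy parameters and the use of SHA-256 reduced modulo $q$ as the hash $h$ are assumptions for the example; with such a small $q$ the hash is of course not collision resistant in any meaningful sense.

```python
import hashlib

P, Q, G = 2039, 1019, 4  # toy parameters (assumption): G has order Q modulo P


def h(message: str, a: int) -> int:
    """Stand-in for h(m || a): SHA-256 reduced modulo q (assumption)."""
    digest = hashlib.sha256(f"{message}|{a}".encode()).digest()
    return int.from_bytes(digest, "big") % Q


def sign(message, x, r):
    """Return (c, b) with a = g^r, c = h(m || a), b = r - c*x (mod q)."""
    a = pow(G, r, P)
    c = h(message, a)
    return c, (r - c * x) % Q


def verify(message, sig, y):
    """Check c = h(m || g^b y^c)."""
    c, b = sig
    return c == h(message, pow(G, b, P) * pow(y, c, P) % P)


def recover_key(sig1, sig2):
    """x = (b1 - b2) * (c2 - c1)^{-1} mod q for two signatures sharing r."""
    (c1, b1), (c2, b2) = sig1, sig2
    if c1 == c2:
        raise ValueError("challenges coincide; the key cannot be recovered")
    return (b1 - b2) * pow(c2 - c1, Q - 2, Q) % Q
```

The recovery works because $b_1 - b_2 = (r - c_1 x) - (r - c_2 x) = (c_2 - c_1)x$ modulo $q$, which is why a fresh random $r$ is required for every signature.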


The Blind Signature Scheme. The basic signature scheme can be transformed
into a blind signature scheme. To understand the ideas, we first recall
our application scenario. The customer (Vic) would like to submit a coin
to the shop (Alice). The coin is signed by the bank (Peggy). Alice must be
able to verify Peggy’s signature. Later, when Alice brings the coin to the
bank, Peggy should not be able to recognize that she signed the coin for Vic.
Therefore, Peggy has to sign the coin blindly. Vic obtains the blind signature
for a message $m$ from Peggy by executing the interactive protocol ProofLog
with Peggy. In step 2 of the protocol, he deviates a little from the original
protocol: as in the non-interactive version ProofLog$_h$, the challenge is not
chosen randomly, but computed by means of the hash function $h$ (with $m$ as
part of the input). We denote the transcript of this interaction by $(\bar{a}, \bar{c}, \bar{b})$;
i.e., $\bar{a}$ is Peggy’s commitment in step 1, $\bar{c}$ is Vic’s challenge in step 2 and $\bar{b}$ is
sent from Peggy to Vic in step 3. $(\bar{c}, \bar{b})$ is a valid signature with $\bar{a} = g^{\bar{b}} y^{\bar{c}}$.
Now Peggy may store Vic’s identity and the transcript $\bar{\tau} = (\bar{a}, \bar{c}, \bar{b})$, but
later she should not be able to recognize the signature $\sigma(m)$ of $m$. Therefore,
Vic must transform Peggy’s signature $(\bar{c}, \bar{b})$ into another valid signature $(c, b)$
of Peggy that Peggy is not able to link with the original transcript $(\bar{a}, \bar{c}, \bar{b})$.


⁸ The elements in $G_q$ have order $q$, hence all exponents and logarithms are computed
modulo $q$. See “Computing modulo a prime” on page 303.

The idea is that Vic transforms the given transcript $\bar{\tau}$ into another accepting
transcript $\tau = (a, c, b)$ of ProofLog by the following transformation:

$$\beta = \beta_{(u,v,w)} : G_q \times \mathbb{Z}_q^2 \to G_q \times \mathbb{Z}_q^2, \quad (\bar{a}, \bar{c}, \bar{b}) \mapsto (a, c, b),$$

where

$$a := \bar{a}^u g^v y^w,$$

$$c := u\bar{c} + w,$$

$$b := u\bar{b} + v.$$

We have

$$a = \bar{a}^u g^v y^w = (g^{\bar{b}} y^{\bar{c}})^u g^v y^w = g^{u\bar{b}+v} y^{u\bar{c}+w} = g^b y^c.$$

Thus, $\tau$ is indeed an accepting transcript of ProofLog. If Vic chooses $u, v, w \in
\{0, \ldots, q-1\}$ at random, then $\tau$ and $\bar{\tau}$ are independent, and Peggy cannot get
any information about $\tau$ by observing the transcript $\bar{\tau}$. Namely, given $\bar{\tau} =
(\bar{a}, \bar{c}, \bar{b})$, each $(a, c, b)$ occurs exactly $q$ times among the $\beta_{(u,v,w)}(\bar{a}, \bar{c}, \bar{b})$, where
$(u, v, w) \in \mathbb{Z}_q^3$. Thus, the probability that $\tau = (a, c, b)$ is Vic’s transformation
of $(\bar{a}, \bar{c}, \bar{b})$ is


$$\frac{|\{(u, v, w) \in \mathbb{Z}_q^3 \mid \beta_{(u,v,w)}(\bar{a}, \bar{c}, \bar{b}) = (a, c, b)\}|}{|\mathbb{Z}_q^3|} = q^{-2},$$


and this is the same as the probability of $\tau = (a, c, b)$ if we randomly (and
uniformly) select a transcript from the set $\mathcal{T}$ of all accepting transcripts of
ProofLog. We see that Peggy’s signature is really blind: she has no better
chance than guessing at random to link her signature $(c, b)$ with the original
transcript $(\bar{a}, \bar{c}, \bar{b})$. Peggy receives no information about the transcript $\tau$ from
knowing $\bar{\tau}$ (in the information-theoretic sense, see Appendix B.4).

On the other hand, the following arguments give evidence that for Vic,
the only way to transform $(\bar{a}, \bar{c}, \bar{b})$ into another accepting, randomly looking
transcript $(a, c, b)$ is to randomly choose $(u, v, w)$ and to apply $\beta_{(u,v,w)}$. First,
we observe that Vic has to set $a = \bar{a}^u g^v y^w$ (for some $u, v, w$). Namely, assume
that Vic sets $a = \bar{a}^u g^v y^w g'$ with some randomly chosen $g'$. Then he gets


$$a = \bar{a}^u g^v y^w g' = (g^{\bar{b}} y^{\bar{c}})^u g^v y^w g' = g^{u\bar{b}+v} y^{u\bar{c}+w} g'$$

and, since $(a, c, b)$ is an accepting transcript, it follows that


$$g^{u\bar{b}+v} y^{u\bar{c}+w} g' = g^b y^c$$


or


$$g^{b - (u\bar{b}+v) + x(c - (u\bar{c}+w))} = g'.$$

This equation shows that Peggy and Vic together could compute $\log_g(g')$.
This contradicts the discrete logarithm assumption, because $g'$ was chosen at
random.

If $(a, c, b)$ is an accepting transcript, we get from $a = \bar{a}^u g^v y^w$ that


$$g^b y^c = a = \bar{a}^u g^v y^w = (g^{\bar{b}} y^{\bar{c}})^u g^v y^w = g^{u\bar{b}+v} y^{u\bar{c}+w},$$


and conclude that


$$g^{(u\bar{b}+v) - b} = y^{c - (u\bar{c}+w)}.$$


This implies that

$$b = u\bar{b} + v \quad\text{and}\quad c = u\bar{c} + w,$$

because otherwise Vic could compute Peggy’s secret key $x$ as

$$(u\bar{b} + v - b)(c - u\bar{c} - w)^{-1}.$$

Our considerations lead to Schnorr’s blind signature scheme. In this
scheme, the verifier Vic gets a blind signature for $m$ from the prover Peggy
by executing the following protocol.


Protocol 4.18.
BlindLogSig$_h$(m):


1. Peggy randomly chooses $\bar{r} \in \{0, \ldots, q-1\}$, computes $\bar{a} := g^{\bar{r}}$
and sends it to Vic.


2. Vic chooses $u, v, w \in \{0, \ldots, q-1\}$, $u \neq 0$, at random and computes
$a = \bar{a}^u g^v y^w$, $c := h(m \| a)$ and $\bar{c} := (c - w)u^{-1}$. Vic sends $\bar{c}$
to Peggy.


3. Peggy computes $\bar{b} := \bar{r} - \bar{c}x$ and sends it to Vic.

4. Vic verifies whether $\bar{a} = g^{\bar{b}} y^{\bar{c}}$, computes $b := u\bar{b} + v$ and gets the
signature $\sigma(m) := (c, b)$ of $m$.


The verification condition for a signature $(c, b)$ is $c = h(m \| g^b y^c)$.
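
A run of the blind signature protocol can be simulated in one function that plays both roles. This is a sketch under assumptions: toy parameters, SHA-256 reduced modulo $q$ as a stand-in for $h$, and hypothetical function names. In a real deployment Peggy’s steps run on the signer’s side and Vic never sees $x$.

```python
import hashlib
import secrets

P, Q, G = 2039, 1019, 4  # toy parameters (assumption): G has order Q modulo P


def h(message: str, a: int) -> int:
    """Stand-in for h(m || a): SHA-256 reduced modulo q (assumption)."""
    digest = hashlib.sha256(f"{message}|{a}".encode()).digest()
    return int.from_bytes(digest, "big") % Q


def blind_sign(message, x, y):
    """Simulate the four protocol steps; returns the signature and Peggy's view."""
    # Step 1 (Peggy): commitment.
    r_bar = secrets.randbelow(Q)
    a_bar = pow(G, r_bar, P)
    # Step 2 (Vic): blind the commitment and derive the blinded challenge.
    u = 1 + secrets.randbelow(Q - 1)  # u != 0
    v, w = secrets.randbelow(Q), secrets.randbelow(Q)
    a = pow(a_bar, u, P) * pow(G, v, P) * pow(y, w, P) % P
    c = h(message, a)
    c_bar = (c - w) * pow(u, Q - 2, Q) % Q
    # Step 3 (Peggy): response to the blinded challenge.
    b_bar = (r_bar - c_bar * x) % Q
    # Step 4 (Vic): check Peggy's answer, then unblind.
    if a_bar != pow(G, b_bar, P) * pow(y, c_bar, P) % P:
        raise ValueError("signer's response is inconsistent")
    b = (u * b_bar + v) % Q
    return (c, b), (a_bar, c_bar, b_bar)


def verify(message, sig, y):
    """Check c = h(m || g^b y^c)."""
    c, b = sig
    return c == h(message, pow(G, b, P) * pow(y, c, P) % P)
```

The second return value is the transcript $(\bar{a}, \bar{c}, \bar{b})$ that Peggy observes; the signature $(c, b)$ is what Vic later shows to a third party.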

A dishonest Vic may try to use a blind signature $(c, b)$ for more than
one message $m$. This is prevented by the collision resistance of the hash
function $h$. In Section 4.5.2 we will use the blind signatures to form digital
cash. A coin is simply the blind signature $(c, b)$ issued by the bank for a
specific message $m$. Thus, there is another basic security requirement for blind
signatures: Vic should have no chance of deriving more than one $(c, b)$ from
one transcript $(\bar{a}, \bar{c}, \bar{b})$, i.e., from one execution of the protocol. Otherwise, in
our digital cash example, Vic could derive more than one coin from the one
coin issued by the bank. The Schnorr blind signature scheme seems to fulfill
this security requirement. Namely, since $h$ is collision resistant, Vic can work
with at most one $a$. Moreover, Vic can know at most one triplet $(u, v, w)$, with
$a = \bar{a}^u g^v y^w$. This follows from Proposition 4.21 below, because $\bar{a}$, $g$ and $y$ are
chosen independently. Finally, the transformations $\beta_{(u,v,w)}$ are the only ones
Vic can apply, as we saw above. However, we only gave convincing evidence
for the latter statement, not a rigorous mathematical proof.

There is no mathematical security proof for either Schnorr’s blind signature
scheme or the underlying Schnorr identification scheme. A modification
of the Schnorr identification scheme, the Okamoto-Schnorr identification
scheme, is proven to be secure under the discrete logarithm assumption
([Okamoto92]). [PoiSte2000] gives a security proof for the Okamoto-Schnorr
blind signature scheme, derived from the Okamoto-Schnorr identification
scheme. It is shown that no one can derive more than $l$ signed messages
after receiving $l$ blind signatures from the signer. The proof is in the so-called
random oracle model: the hash function is assumed to behave like a truly random
function. Below we use Schnorr’s blind signature scheme, because it is
a bit easier.


If we use BlindLogSig$_h$ as a subprotocol, we simply write

$$(c, b) = \mathrm{BlindLogSig}_h(m).$$


The Blindly Issued Proof that Two Logarithms are Equal. As before,
let $p$ and $q$ be large primes, such that $q$ divides $p - 1$, and let $G_q$ be the
subgroup of order $q$ in $\mathbb{Z}_p^*$, $g$ a generator of $G_q$ and $h : \{0,1\}^* \to \mathbb{Z}_q$ be a
collision-resistant hash function. Peggy’s secret is a randomly chosen $x \in
\{0, \ldots, q-1\}$, whereas $y = g^x$ is publicly known.

In Section 4.4.3 we introduced an interactive proof ProofLogEq$(g, y, \tilde{g}, \tilde{y})$
that two logarithms are equal. Given a $\tilde{y}$ with $\tilde{y} = \tilde{g}^x$, Peggy can prove to the
verifier Vic that she knows the common logarithm $x$ of $y$ and $\tilde{y}$ (with respect
to the bases $g$ and $\tilde{g}$). This proof can be transformed into a blindly issued
proof. Here it is the goal of Vic to obtain, by interacting with Peggy, a $z$
with $z = m^x$ for a given message $m \in G_q$, together with a proof $(c, b)$ of this
fact which he may then present to somebody else (recall that $x$ is Peggy’s
secret and is not revealed). Now, Peggy should issue this proof blindly, i.e., she
should not see $m$ and $z$ during the interaction, and later she should not be able
to link $(m, z, c, b)$ with this issuing transaction. We proceed in a similar way
as in the blind signature scheme. Vic must not give $m$ to Peggy, so he sends
a cleverly transformed $\bar{m}$. Peggy computes $\bar{z} = \bar{m}^x$, and then both execute
the ProofLogEq protocol (Section 4.4.3), with the analogous modification
as above: Vic does not choose his challenge randomly, but computes it by
means of the hash function $h$ (with $m$ as part of the input). This interaction
results in a transcript $\bar{\tau} = (\bar{m}, \bar{z}, \bar{a}_1, \bar{a}_2, \bar{c}, \bar{b})$. Finally, Vic transforms $\bar{\tau}$ into
an accepting transcript $\tau = (m, z, a_1, a_2, c, b)$ of the non-interactive version of
ProofLogEq (see Section 4.4.4). Since $\tau$ appears completely random to Peggy,
she cannot link $\tau$ to the original $\bar{\tau}$. Vic uses the following transformation:


$$a_1 := \bar{a}_1^u g^v y^w,$$

$$a_2 := (\bar{a}_2^u \bar{m}^v \bar{z}^w)^s (\bar{a}_1^u g^v y^w)^t,$$

$$c := u\bar{c} + w,$$

$$b := u\bar{b} + v,$$

with $m = \bar{m}^s g^t$ and $z = \bar{z}^s y^t$ as in Protocol 4.19 below.

As above, a straightforward computation shows that $a_1 = g^b y^c$ and $a_2 =
m^b z^c$. Thus, the transcript $\tau$ is indeed an accepting one. Analogous arguments
as those given for the blind signature scheme show that the given
transformation is the only way for Vic to obtain an accepting transcript, and
that $\tau$ and $\bar{\tau}$ are independent if Vic chooses $u, v, w, s, t \in \{0, \ldots, q-1\}$ at
random. Hence, the proof is really blind: Peggy gets no information about
the transcript $\tau$ from knowing $\bar{\tau}$.

Our considerations lead to the following blind signature scheme. The signature
is issued blindly by Peggy. If Vic wants to get a signature for $m$, he
transforms $m$ to $\bar{m}$ and sends it to Peggy. Peggy computes $\bar{z} = \bar{m}^x$ and executes
the ProofLogEq protocol with Vic to prove that $\log_{\bar{m}}(\bar{z}) = \log_g(y)$.
Vic derives $z$ and a proof that $\log_m(z) = \log_g(y)$ from $\bar{z}$ and the proof that
$\log_{\bar{m}}(\bar{z}) = \log_g(y)$. The signature of $m$ consists of $z = m^x$ and the proof that


$$\log_m(z) = \log_g(y).$$


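Protocol 4.19.
BlindLogEqSig$_h$(g, y, m):


1. Vic chooses $s, t \in \{0, \ldots, q-1\}$, $s \neq 0$, at random, computes
$\bar{m} := m^{1/s} g^{-t/s}$ and sends $\bar{m}$ to Peggy.⁹

2. Peggy randomly chooses $\bar{r} \in \{0, \ldots, q-1\}$ and computes $\bar{z} := \bar{m}^x$
and $\bar{a} := (\bar{a}_1, \bar{a}_2) = (g^{\bar{r}}, \bar{m}^{\bar{r}})$. Peggy sends $(\bar{z}, \bar{a})$ to Vic.

3. Vic chooses $u, v, w \in \{0, \ldots, q-1\}$, $u \neq 0$, at random, and
computes $a_1 := \bar{a}_1^u g^v y^w$, $a_2 := (\bar{a}_2^u \bar{m}^v \bar{z}^w)^s (\bar{a}_1^u g^v y^w)^t$ and $z :=
\bar{z}^s y^t$. Then Vic computes $c := h(m \| z \| a_1 \| a_2)$ and $\bar{c} := (c - w)u^{-1}$,
and sends $\bar{c}$ to Peggy.

4. Peggy computes $\bar{b} := \bar{r} - \bar{c}x$ and sends it to Vic.

5. Vic verifies whether $\bar{a}_1 = g^{\bar{b}} y^{\bar{c}}$ and $\bar{a}_2 = \bar{m}^{\bar{b}} \bar{z}^{\bar{c}}$, computes $b :=
u\bar{b} + v$ and receives $(z, c, b)$ as the final result.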

If Vic presents the proof to Alice, then Alice may verify the proof by checking
the verification condition $c = h(m \| z \| g^b y^c \| m^b z^c)$.

Below we will use BlindLogEqSig$_h$ as a subprotocol. Then we simply write


$$(z, c, b) = \mathrm{BlindLogEqSig}_h(g, y, m).$$
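
The blindly issued proof can be simulated in the same style as the blind signature. As before, this is only a sketch: the toy parameters, the SHA-256-based stand-in for $h$ and the function names are assumptions, and both roles are played inside one function for brevity.

```python
import hashlib
import secrets

P, Q, G = 2039, 1019, 4  # toy parameters (assumption): G has order Q modulo P


def h(*parts) -> int:
    """Stand-in for h(m || z || a_1 || a_2): SHA-256 reduced modulo q."""
    data = "|".join(str(part) for part in parts).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


def blind_log_eq_sign(m, x, y):
    """Simulate the five protocol steps for a message m in G_q."""
    # Step 1 (Vic): blind the message, m_bar = m^{1/s} * g^{-t/s}.
    s = 1 + secrets.randbelow(Q - 1)  # s != 0
    t = secrets.randbelow(Q)
    s_inv = pow(s, Q - 2, Q)
    m_bar = pow(m, s_inv, P) * pow(G, (-t * s_inv) % Q, P) % P
    # Step 2 (Peggy): z_bar = m_bar^x and the two commitments.
    r_bar = secrets.randbelow(Q)
    z_bar = pow(m_bar, x, P)
    a1_bar, a2_bar = pow(G, r_bar, P), pow(m_bar, r_bar, P)
    # Step 3 (Vic): transform commitments, unblind z, derive the challenge.
    u = 1 + secrets.randbelow(Q - 1)  # u != 0
    v, w = secrets.randbelow(Q), secrets.randbelow(Q)
    a1 = pow(a1_bar, u, P) * pow(G, v, P) * pow(y, w, P) % P
    inner = pow(a2_bar, u, P) * pow(m_bar, v, P) * pow(z_bar, w, P) % P
    a2 = pow(inner, s, P) * pow(a1, t, P) % P
    z = pow(z_bar, s, P) * pow(y, t, P) % P
    c = h(m, z, a1, a2)
    c_bar = (c - w) * pow(u, Q - 2, Q) % Q
    # Step 4 (Peggy): response.
    b_bar = (r_bar - c_bar * x) % Q
    # Step 5 (Vic): check both equations, then unblind the response.
    ok1 = a1_bar == pow(G, b_bar, P) * pow(y, c_bar, P) % P
    ok2 = a2_bar == pow(m_bar, b_bar, P) * pow(z_bar, c_bar, P) % P
    if not (ok1 and ok2):
        raise ValueError("signer's response is inconsistent")
    return z, c, (u * b_bar + v) % Q


def verify_log_eq(m, proof, y):
    """Check c = h(m || z || g^b y^c || m^b z^c)."""
    z, c, b = proof
    a1 = pow(G, b, P) * pow(y, c, P) % P
    a2 = pow(m, b, P) * pow(z, c, P) % P
    return c == h(m, z, a1, a2)
```

The unblinded value satisfies $z = \bar{z}^s y^t = (\bar{m}^s g^t)^x = m^x$, although Peggy only ever saw $\bar{m}$ and $\bar{z}$.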
Remarks: 


1. Again the collision resistance of $h$ implies that Vic can use the proof
$(z, c, b)$ for only one message $m$. As before, in the blind signature protocol
BlindLogSig$_h$, we see that Vic cannot derive two different signatures
$(z, c, b)$ and $(\tilde{z}, \tilde{c}, \tilde{b})$ from one execution of the protocol, i.e., from one
transcript $(\bar{m}, \bar{z}, \bar{a}_1, \bar{a}_2, \bar{c}, \bar{b})$.


⁹ As is common practice, we denote the $s$-th root $x^{s^{-1}}$ of an element $x$ of order
$q$ by $x^{1/s}$. Here, $s^{-1}$ is the inverse of $s$ modulo $q$. See “Computing modulo a
prime” on page 303.

2. The protocols BlindLogEqSig$_h$ and BlindLogSig$_h$ may be merged to yield
not only a signature of $m$ but also a signature of an additionally given
message $M$. Namely, if Vic computes $c = h(M \| m \| z \| a_1 \| a_2)$ in step 3,
then $(c, b)$ is also a blind signature of $M$, formed in the same way as the
signatures of BlindLogSig$_h$. We denote this merged protocol by


$$(z, c, b) = \mathrm{BlindLogEqSig}_h(M, g, y, m),$$


and call it the proof BlindLogEqSig$_h$ dependent on the message $M$. It
simultaneously gives signatures of $M$ and $m$ (consisting of $z$ and a proof
that $\log_g(y) = \log_m(z)$).


4.5.2 A Fair Electronic Cash System 


The payment scheme we describe is published in [CamMauSta96]. A coin 
is a bit string that is (blindly) signed by the bank. For simplicity, we re- 
strict the scheme to a single denomination of coins: the extension to multiple 
denominations is straightforward, with the bank using a separate key pair 
for each denomination. As discussed in the introduction of Section 4.5, in a 
fair electronic cash system the tracing of a coin and the revoking of the cus- 
tomer’s anonymity must be possible under certain well-defined conditions; for 
example to track kidnappers who obtain a ransom as electronic cash. Here, 
anonymity may be revoked by a trusted third party, the trusted center. 


System Setup. As before, let $p$ and $q$ be large primes such that $q$ divides
$p - 1$. Let $G_q$ be the subgroup of order $q$ in $\mathbb{Z}_p^*$. Let $g$, $g_1$ and $g_2$ be randomly
and independently chosen generators of $G_q$. The security of the scheme requires
that the discrete logarithm of none of these elements with respect to another
of these elements is known. Since $g$, $g_1$ and $g_2$ are chosen randomly and
independently, this is true with a very high probability. Let $h : \{0,1\}^* \to \mathbb{Z}_q$
be a collision-resistant hash function.


1. The bank chooses a secret key $x \in \{1, \ldots, q-1\}$ at random and publishes
$y = g^x$.

2. The trusted center $T$ chooses a secret key $x_T \in \{1, \ldots, q-1\}$ at random
and publishes $y_T = g_2^{x_T}$.

3. The customer (we call her Alice) has the secret key $x_C \in \{1, \ldots, q-1\}$
(randomly selected) and the public key $y_C = g_1^{x_C}$.

4. The shop’s secret key $x_S$ is randomly selected in $\{1, \ldots, q-1\}$ and $y_S =
g_1^{x_S}$ is the public key.


By exploiting data observed by the bank in the withdrawal protocol, the
trusted center can provide information which enables the recognition of a coin
withdrawn by Alice in the deposit step. This tracing mechanism is called coin
tracing. Moreover, from data observed by the bank in the deposit protocol,
the trusted center can compute information which enables the identification
of the customer. This tracing mechanism is called owner tracing.


Opening an Account. When the customer Alice opens an account, she
proves her identity to the bank. She can do this by executing the protocol
ProofLog$(g_1, y_C)$ with the bank. The bank then opens an account and stores
$y_C$ in Alice’s entry in the account database.


The Online Electronic Cash System. We first discuss the online system. 
The trusted center is involved in every withdrawal transaction, and the bank 
is involved in every payment. 


The Withdrawal Protocol. As before, Alice has to authenticate herself
to the bank. She can do this by proving that she knows $x_C = \log_{g_1}(y_C)$. To
get a coin, Alice executes the withdrawal protocol with the bank. It is a
modification of the BlindLogSig$_h$ protocol given in Section 4.5.1. Essentially,
a coin is a signature of the empty string blindly issued by the bank.


Protocol 4.20.
Withdrawal:


1. The bank randomly chooses $\bar{r} \in \{0, \ldots, q-1\}$ and computes
$\bar{a} = g^{\bar{r}}$. The bank sends $\bar{a}$ to Alice.

2. Alice chooses $u, v, w \in \{0, \ldots, q-1\}$, $u \neq 0$, at random, and
computes $a := \bar{a}^u g^v y^w$ and $c := h(a)$, $\bar{c} := (c - w)u^{-1}$. Alice
sends $(u, v, w)$ encrypted with the trusted center’s public key
and $\bar{c}$ to the bank.

3. The bank sends $\bar{a}$ and $\bar{c}$ and the encrypted $(u, v, w)$ to the trusted
center $T$.

4. $T$ checks whether $u\bar{c} + w = h(\bar{a}^u g^v y^w)$, and sends the result to
the bank.

5. If the result is correct, the bank computes $\bar{b} := \bar{r} - \bar{c}x$ and sends
it to Alice. Alice’s account is debited.

6. Alice verifies whether $\bar{a} = g^{\bar{b}} y^{\bar{c}}$, computes $b := u\bar{b} + v$ and gets
as the coin the signature $\sigma := (c, b)$ of the empty message.


Payment and Deposit. In online payment, the shop must be connected to 
the bank when the customer spends the coin. Payment and deposit form one 
transaction. Alice spends a coin by sending it to a shop. The shop verifies 
the coin, i.e., it verifies the signature by checking $c = h(g^b y^c)$. If the coin is
valid, it passes the coin to the bank. The bank also verifies the coin and then 
compares it with all previously spent coins (which are stored in the database). 
If the coin is new, the bank accepts it and inserts it into the database. The 
shop’s account is credited. 


Coin and Owner Tracing. The trusted center can link $(\bar{a}, \bar{c}, \bar{b})$ and $(c, b)$,
which enables coin and owner tracing.


Customer Anonymity. The anonymity of the customer relies on the fact
that the used signature scheme is blind and on the security of the encryption
scheme used to encrypt the blinding factors $u$, $v$ and $w$.


The Offline Electronic Cash System. In the offline system, the trusted 
center is not involved in the withdrawal transaction, and the bank is not 
involved in the payment protocol. To achieve an offline trusted center, the 
immediate check that is performed by the trusted center in the withdrawal 
protocol above is replaced by a proof that Alice correctly provides the in- 
formation necessary for tracing the coin. This proof can be checked by the 
bank. To obtain such a proof, the BlindLogSig$_h$ protocol is replaced by the
BlindLogEqSig$_h$ protocol (see Section 4.5.1).

Essentially, a coin consists of a pair $(m, z)$ with $m = g_1 g_2^s$, where $s$ is
chosen at random and $z = m^x$, and a proof of this fact which is issued
blindly by the bank. For this purpose the BlindLogEqSig$_h$ protocol is executed:
the bank sees $(\bar{m} = m^{1/s}, \bar{z} = z^{1/s})$. We saw above that the general
blinding transformation in the BlindLogEqSig$_h$ protocol is $\bar{m} = m^{1/s} g^{-t/s}$.
Here Alice chooses $t = 0$, i.e., $m = \bar{m}^s$; otherwise Alice could not perform
ProofLog$_h(M, m g_1^{-1}, g_2)$ as required in the payment protocol (see
below). The blinding exponent $s$ is encrypted by $d = y_T^s = g_2^{x_T s}$, where
$y_T$ is the trusted center’s public key. This enables the trusted center to revoke
anonymity later. It can get $m$ by decrypting $d$ (note that $m = g_1 g_2^s =
g_1 d^{1/x_T}$), and the coin can be traced by linking $m$ and $\bar{m}$.


The Withdrawal Protocol. As before, first Alice has to authenticate herself
to the bank. The withdrawal protocol in the offline system consists of two
steps. In the first of these steps, Alice generates an encryption of the message
$m$ to enable anonymity revocation by the trusted center. In the withdrawal
step, Alice executes a message-dependent proof BlindLogEqSig$_h$ with the
bank to obtain a (blind) signature on her coin:


1. Enable coin and owner tracing. Coin tracing means that, starting from
the information gathered during the withdrawal, the bank recognizes a
specific coin in the deposit protocol. Owner tracing identifies the withdrawer
of a coin, starting from the deposited coin. Tracings require the
cooperation of the trusted center. To enable coin and owner tracing, Alice
encrypts the message $m$ with the trusted center’s public key $y_T$ and
proves that she correctly performs this encryption:

a. Alice chooses $s \in \{1, \ldots, q-1\}$ at random, then computes

$$m = g_1 g_2^s, \quad d = y_T^s, \quad \bar{m} = m^{1/s} = g_1^{1/s} g_2$$

and

$$(c, b) = \mathrm{ProofLogEq}_h(\bar{m} g_2^{-1}, g_1, y_T, d).$$

By the proof, Alice shows that $\log_{\bar{m} g_2^{-1}}(g_1) = \log_{y_T}(d)$ and that she
knows this logarithm. Alice sends $(c, b, \bar{m}, g_1, y_T, d)$ to the bank.

b. The bank verifies the proof. If the verification condition holds, then
the bank stores $d$ in Alice’s entry in the withdrawal database for a
possible later anonymity revocation.

2. The withdrawal of a coin. Alice chooses $r \in \{0, \ldots, q-1\}$ at random,
computes the so-called coin number $c\# = g_2^r$ and executes


$$(z, c_1, b_1) = \mathrm{BlindLogEqSig}_h(c\#, g, y, m)$$


with the bank. Here, in the first step of BlindLogEqSig$_h$, Alice takes the $s$
from the coin and owner tracing step 1 and $t = 0$. Thus, she sends the $\bar{m} =
m^{1/s} = g_1^{1/s} g_2$ from step 1 to the bank. The variant of BlindLogEqSig$_h$
is used that, in addition to the signature of $m$, gives a signature of $c\#$ by
the bank (see the remarks after Protocol 4.19). The coin number $c\#$ is
needed in the payment below. The bank debits Alice’s account. The coin
consists of $(c_1, b_1, c\#, g, y, m, z)$ and some additional information Alice
has to supply when spending it (see the payment protocol below).


Payment. In an offline system, double spending cannot be prevented. It 
can only be detected by the bank in the deposit protocol. An additional 
mechanism in the protocols is necessary to decide whether the customer or 
the shop doubled the coin. If the coin was doubled by the shop, the bank 
refuses to credit the shop’s account. If the customer doubled the coin, her 
anonymity should be revoked without the help of the trusted center. This can 
be achieved by the payment protocol. For this purpose, Alice, when paying 
with the coin, has to sign the message $M = y_S \| \mathit{time} \| (c_1, b_1)$, where $y_S$ is the
public key of the shop, $\mathit{time}$ is the time of the payment and $(c_1, b_1)$ is the
bank’s blind signature on the coin from above. Alice has to sign by means
of the basic signature scheme. There she has to use $s = \log_{g_2}(m g_1^{-1})$ as her
secret and the coin number $c\# = g_2^r$, which she computed in the withdrawal
protocol, as her commitment $a$.


\[ \sigma(M) = (c_2, b_2) = \mathrm{ProofLog}_h(M, m g_1^{-1}, g_2). \]


Now, if Alice spent the same coin $(c_1, b_1, c\#, g, y, m, z)$ twice, she would
provide two signatures $\sigma(M)$ and $\sigma(M')$ of different messages $M$ and $M'$ (at
least the times differ!). Both signatures are computed with the same
commitment $a = c\#$. Then, it is easy to identify Alice (see below). The coin
submitted to the shop is defined by:


\[ \mathit{coin} = \bigl( (c_1, b_1, c\#, g, y, m, z),\ (c_2, b_2, M, g_2, m g_1^{-1}) \bigr). \]


The shop verifies the coin, i.e., it verifies: 


1. The correct form of $M$.
2. Whether $c_2 = h(M \| c\#)$.
3. The proof $(z, c_1, b_1) = \mathrm{BlindLogEqSig}_h(c\#, g, y, m)$.
4. The proof $(c_2, b_2) = \mathrm{ProofLog}_h(M, m g_1^{-1}, g_2)$, by testing

\[ c_2 = h\bigl(M \| g_2^{b_2} (m g_1^{-1})^{c_2}\bigr). \]


Since $h$ is collision resistant and the shop checks $c_2 = h(M \| c\#)$, Alice
necessarily has to use the coin number $c\#$ in the second proof. The shop accepts
if the coin passes the verification.


Deposit. The shop sends the coin to the bank. The bank verifies the coin and 
searches in the database for an identical coin. If she finds an identical coin, 
she refuses to credit the shop’s account. If she finds a coin with identical first 
and different second component, the bank revokes the customer’s anonymity 
(see below). 


Coin and Owner Tracing. 


1. Coin tracing. If the bank provides the trusted center $T$ with $d = y_T^s$
produced by Alice in the withdrawal protocol, $T$ computes $m$:

\[ g_1 d^{1/x_T} = g_1 g_2^s = m. \]

This value can be used to recognize the coin in the deposit protocol.

2. Owner tracing. If the bank provides the trusted center $T$ with the $m$ of
a spent coin, $T$ computes $d$:

\[ (m g_1^{-1})^{x_T} = (g_2^s)^{x_T} = y_T^s = d. \]

This value can be used for searching in the withdrawal database.
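
The two tracing identities can be checked numerically. The following Python sketch uses toy parameters chosen purely for illustration (the values of p, q, g_1, g_2, x_T and s are assumptions and far too small for real use). It also assumes, as the identities above imply, that the trusted center's public key is y_T = g_2^{x_T}. The function names are hypothetical.

```python
# Toy parameters (assumed for illustration only): q divides p - 1.
p, q = 23, 11
g1, g2 = 2, 3                 # two elements of order q in Z_p^*


def inv_exp(e: int) -> int:
    """Inverse of an exponent modulo the group order q."""
    return pow(e, -1, q)


x_T = 7                        # trusted center's secret key (assumed value)
y_T = pow(g2, x_T, p)          # public key y_T = g2^x_T (implied by the tracing identities)

s = 5                          # Alice's blinding exponent (assumed value)
m = g1 * pow(g2, s, p) % p     # m = g1 * g2^s
d = pow(y_T, s, p)             # d = y_T^s, stored by the bank at withdrawal
m_bar = pow(m, inv_exp(s), p)  # m_bar = m^(1/s) = g1^(1/s) * g2


def trace_coin(d_value: int) -> int:
    """Coin tracing: T recovers m from d as g1 * d^(1/x_T)."""
    return g1 * pow(d_value, inv_exp(x_T), p) % p


def trace_owner(m_value: int) -> int:
    """Owner tracing: T recovers d from m as (m * g1^(-1))^x_T."""
    return pow(m_value * pow(g1, -1, p) % p, x_T, p)
```

Calling `trace_coin(d)` returns `m`, and `trace_owner(m)` returns `d`, for any choice of s in {1, ..., q-1}.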


Security. 


1. Double spending. If Alice spends a coin twice (at different shops, or at the 
same shop but at different times), she produces signatures of two different 
messages. Both signatures are computed with the same commitment c# 
(the coin number). This reveals the signer’s secret, which is the blinding 
exponent s in our case (see the remarks after Protocol 4.17). Knowing 
the blinding exponent $s$, the bank can derive $\overline{m} = m^{1/s}$. Searching in the
withdrawal database yields Alice’s identity. If the shop doubled the coin, 
the bank detects an identical coin in the database. Then she refuses to 
credit the shop’s account. 

2. Customer’s anonymity. The anonymity of the customer is ensured. Namely,
$\mathrm{BlindLogEqSig}_h$ is a perfectly blind signature scheme, as we observed
above. Moreover, since ProofLogEq is honest-verifier zero knowledge (see
Section 4.4.3), the bank also cannot get any information about the blinding
exponent $s$ from the $\mathrm{ProofLogEq}_h$ in the withdrawal protocol. Note
that any information about $s$ could enable the bank to link $\overline{m}$ with $m$,
and hence the withdrawal of the coin with the deposit of the coin. The
bank could also establish this link, if she were able to determine (without
computing the logarithms) that

\[ \log_{g_2}(m g_1^{-1}) = \log_{\overline{m} g_2^{-1}}(g_1). \tag{4.3} \]

Then she could link the proofs $\mathrm{ProofLogEq}_h(\overline{m} g_2^{-1}, g_1, y_T, d)$ (from the
withdrawal transaction) and $\mathrm{ProofLog}_h(M, g_2, m g_1^{-1})$ (from the deposit
transaction). However, an efficient way to decide whether (4.3) holds would
contradict the decision Diffie-Hellman assumption (see Section 4.5.3).

3. Security for the trusted center. The trusted center makes coin and owner
tracing possible. The tracings require that Alice correctly forms $m = g_1 g_2^s$
and $\overline{m} = m^{1/s} = g_1^{1/s} g_2$, and that it is really $y_T^s$, which she sends as $d$ to
the bank (in the withdrawal transaction).
Now in the withdrawal protocol, Alice proves that $\overline{m} = g_1^{1/s} g_2$ and that
$d = y_T^s$. In the payment protocol, Alice proves that $m = g_1 g_2^{\tilde{s}}$ with $\tilde{s}$
known to her. It is not clear a priori that $\tilde{s} = s$ or, equivalently, that
$\overline{m} = m^{1/s}$. However, as we observed before in the blind signature scheme
$\mathrm{BlindLogEqSig}_h$, the only way to transform $\overline{m}$ into $m$ is to choose $\sigma$ and
$\tau$ at random and to compute $m = \overline{m}^{\sigma} g^{\tau}$. From this we conclude that
indeed $\tilde{s} = s$ and $m = \overline{m}^{s}$:

\[ g_1 g_2^{\tilde{s}} = m = \overline{m}^{\sigma} g^{\tau} = (g_1^{1/s} g_2)^{\sigma} g^{\tau} = g_1^{\sigma/s} g_2^{\sigma} g^{\tau}. \]

Hence, $\tau = 0$ and $\sigma = \tilde{s} = s$ (by Proposition 4.21, below).
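
The double-spending argument in item 1 can be illustrated with a small Python sketch. It assumes a Schnorr-type signature in which the response is b = r - c*s mod q and verification tests c = h(M || g_2^b y^c) with y = g_2^s. This convention, the toy parameters and the SHA-256-based stand-in for h are assumptions made for the example, not a definitive rendering of the basic signature scheme.

```python
import hashlib

# Toy parameters (assumed): q divides p - 1, and g2 has order q in Z_p^*.
p, q = 2039, 1019
g2 = 4


def h(message: bytes, a: int) -> int:
    """Stand-in for h(M || a): SHA-256 reduced modulo q (illustrative only)."""
    digest = hashlib.sha256(message + b"|" + str(a).encode()).digest()
    return int.from_bytes(digest, "big") % q


def sign(message: bytes, s: int, r: int):
    """Schnorr-type signature with secret s and commitment a = g2^r."""
    a = pow(g2, r, p)
    c = h(message, a)
    b = (r - c * s) % q
    return c, b


def verify(message: bytes, y: int, c: int, b: int) -> bool:
    """Check c == h(M || g2^b * y^c), where y = g2^s."""
    a = pow(g2, b, p) * pow(y, c, p) % p
    return c == h(message, a)


def extract_secret(c: int, b: int, c_prime: int, b_prime: int) -> int:
    """Recover s from two signatures that reuse the same commitment.

    From b = r - c*s and b' = r - c'*s it follows that
    b' - b = (c - c')*s mod q. Requires c != c' mod q.
    """
    return (b_prime - b) * pow((c - c_prime) % q, -1, q) % q
```

Two payments with the same coin number correspond to two calls of `sign` with the same r but different messages; `extract_secret` then returns the blinding exponent s, from which the bank can compute m^{1/s} and search the withdrawal database.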


4.5.3 Underlying Problems 


The Representation Problem. Let $p$ and $q$ be large primes such that $q$
divides $p - 1$. Let $G_q$ be the subgroup of order $q$ in $\mathbb{Z}_p^*$.

Let $r \geq 2$ and let $g_1, \ldots, g_r$ be pairwise distinct generators of $G_q$.¹⁰
Then $g = (g_1, \ldots, g_r) \in G_q^r$ is called a generator of length $r$. Let $y \in G_q$.
$a = (a_1, \ldots, a_r) \in \mathbb{Z}_q^r$ is a representation of $y$ (with respect to $g$) if

\[ y = \prod_{i=1}^{r} g_i^{a_i}. \]

To represent $y$, the elements $a_1, \ldots, a_{r-1}$ can be chosen arbitrarily; $a_r$ is then
uniquely determined. Therefore, each $y \in G_q$ has $q^{r-1}$ different representations.
Given $y$, the probability that a randomly chosen $a \in \{0, \ldots, q-1\}^r$ is a
representation of $y$ is only $1/q$.
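
The counting claim can be verified by brute force in a tiny group. The sketch below is illustrative only: the parameters p = 23, q = 11 and the generator (2, 3, 4) of length 3 are assumed toy values, and the function names are hypothetical.

```python
from itertools import product

# Toy parameters (assumed): the subgroup of order q = 11 in Z_23^*.
p, q = 23, 11
gens = (2, 3, 4)   # a generator of length r = 3; each entry has order q


def represents(a, y) -> bool:
    """Return True if a = (a_1, ..., a_r) is a representation of y w.r.t. gens."""
    value = 1
    for g_i, a_i in zip(gens, a):
        value = value * pow(g_i, a_i, p) % p
    return value == y


def count_representations(y) -> int:
    """Count all a in {0, ..., q-1}^r that represent y."""
    return sum(represents(a, y) for a in product(range(q), repeat=len(gens)))
```

For every y in the subgroup the count equals q^{r-1}, so a uniformly random a is a representation of y with probability q^{r-1}/q^r = 1/q.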


Proposition 4.21. Assume that it is infeasible to compute discrete
logarithms in $G_q$. Then no polynomial algorithm can exist which, on input of a
randomly chosen generator of length $r \geq 2$, outputs $y \in G_q$ and two different
representations of $y$.


¹⁰ Note that every element of $G_q$ except [1] is a generator of $G_q$ (see Lemma A.40).
Proof. Assume that such an algorithm exists. On input of a randomly chosen
generator, it outputs $y \in G_q$ and two different representations
$a = (a_1, \ldots, a_r)$ and $b = (b_1, \ldots, b_r)$ of $y$. Then, $a - b$ is a non-trivial
representation of [1]. Thus, we have a polynomial algorithm $A$ which on input of
a randomly chosen generator outputs a non-trivial representation of [1]. We
may use $A$ to define an algorithm $B$ that on input of $g \in G_q$, $g \neq [1]$, and
$z \in G_q$, computes the discrete logarithm of $z$ with respect to $g$.


Algorithm 4.22.

int B(int g, z)
    repeat
        select i ∈ {1,...,r} and
            u_j ∈ {1,...,q−1}, 1 ≤ j ≤ r, uniformly at random
        g_i ← z^{u_i}, g_j ← g^{u_j}, 1 ≤ j ≠ i ≤ r
        (a_1,...,a_r) ← A(g_1,...,g_r)
    until a_i u_i ≢ 0 mod q
    return −(a_i u_i)^{−1} (Σ_{j≠i} a_j u_j) mod q


Since $(a_1, \ldots, a_r)$ is a representation of [1] with respect to $(g_1, \ldots, g_r)$,
we have

\[ z^{-a_i u_i} = \prod_{j \neq i} g^{a_j u_j}. \]

Hence, the returned value is indeed the logarithm of $z$. Since at least one $a_j$
returned by $A$ is $\not\equiv 0$ modulo $q$, the probability that $a_i \not\equiv 0$ modulo $q$ is at
least $1/r$. Hence, we expect that the repeat until loop will terminate after at
most $r$ iterations. If $r$ is bounded by a polynomial in the binary length $|p|$ of $p$,
the expected running time of $B$ is polynomial in $|p|$.
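
The reduction can be exercised in a toy group. In the following Python sketch, the oracle `A` is a brute-force stand-in (an assumption for the example: no efficient such algorithm is known), and the parameters p = 23, q = 11 and r = 2 are illustrative values only.

```python
import math
import random
from itertools import product

# Toy parameters (assumed): the subgroup of order q = 11 in Z_23^*.
p, q = 23, 11


def A(gens):
    """Brute-force stand-in for the hypothetical algorithm A:
    returns a random non-trivial representation of [1] w.r.t. gens."""
    solutions = [
        a for a in product(range(q), repeat=len(gens))
        if any(a)
        and math.prod(pow(g_j, a_j, p) for g_j, a_j in zip(gens, a)) % p == 1
    ]
    return random.choice(solutions)


def B(g: int, z: int, r: int = 2) -> int:
    """Compute log_g(z) in the subgroup of order q, using A as a subroutine."""
    while True:
        i = random.randrange(r)
        u = [random.randrange(1, q) for _ in range(r)]
        # Position i hides z; all other positions are random powers of g.
        gens = [pow(z if j == i else g, u[j], p) for j in range(r)]
        a = A(gens)
        if (a[i] * u[i]) % q != 0:
            break
    total = sum(a[j] * u[j] for j in range(r) if j != i)
    return (-pow(a[i] * u[i], -1, q) * total) % q
```

For example, `B(2, pow(2, 7, p))` recovers the exponent 7. The random exponents u_j hide from `A` which position carries z, which is what gives the loop a success probability of at least 1/r per iteration.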


Remark. Assume there is a polynomial algorithm which, when given as input
a generator of length $r \geq 2$, outputs $y \in G_q$ and two different representations
of $y$, not with certainty, but at least with some non-negligible probability.
Then, this algorithm can be used to compute discrete logarithms in $G_q$ with
an overwhelmingly high probability (see Exercise 4 in Chapter 6).

The Decision Diffie-Hellman Problem. Let $p$ and $q$ be large primes, such
that $q$ divides $p - 1$. Let $G_q$ be the subgroup of order $q$ in $\mathbb{Z}_p^*$. Let $g \in G_q$ and
$a, b \in \{0, \ldots, q-1\}$ be randomly chosen. Then, the Diffie-Hellman assumption
(Section 4.1.2) says that it is infeasible to compute $g^{ab}$ from $g^a$ and $g^b$.
Let $g_1 = g^a$, $g_2 = g^b$ and $g_3$ be given. The decision Diffie-Hellman problem
is to decide if

\[ g_3 = g^{ab}. \]

This is equivalent to deciding whether

\[ \log_g(g_3) = \log_g(g_1) \log_g(g_2), \quad \text{or} \quad \log_{g_2}(g_3) = \log_g(g_1). \]
The decision Diffie-Hellman assumption says that no efficient algorithm exists
to solve the decision Diffie-Hellman problem if $a$, $b$ and $g_3$ ($g_1$, $g_2$ and $g_3$,
respectively) are chosen at random (and independently). The decision Diffie-
Hellman problem is random self-reducible (see the remark on page 154). If
you can solve it with an efficient probabilistic algorithm $A$, then you can also
solve it, if $g \in G_q$ is any element of $G_q$ and only $g_1, g_2, g_3$ are chosen randomly.
Namely, let $g \in G_q$, then $(g, g_1, g_2, g_3)$ has the Diffie-Hellman property, if and
only if $(g^s, g_1^s, g_2^s, g_3^s)$ has the Diffie-Hellman property, with $s$ randomly chosen
in $\mathbb{Z}_q^*$.
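
The re-randomization step can be sketched in Python as follows. The toy group (p = 23, q = 11) and the brute-force logarithm used to test the Diffie-Hellman property are assumptions made for illustration; in a group of cryptographic size no such test is assumed to be feasible.

```python
# Toy parameters (assumed): the subgroup of order q = 11 in Z_23^*.
p, q = 23, 11


def dlog(base: int, y: int) -> int:
    """Brute-force discrete logarithm in the toy subgroup (base must not be 1)."""
    for e in range(q):
        if pow(base, e, p) == y:
            return e
    raise ValueError("y is not a power of base")


def has_dh_property(g: int, g1: int, g2: int, g3: int) -> bool:
    """Check g3 == g^(a*b), where g1 = g^a and g2 = g^b."""
    a, b = dlog(g, g1), dlog(g, g2)
    return g3 == pow(g, a * b % q, p)


def rerandomize(quadruple, s: int):
    """Map (g, g1, g2, g3) to (g^s, g1^s, g2^s, g3^s) for s in Z_q^*."""
    return tuple(pow(x, s, p) for x in quadruple)
```

Raising all four elements to the same power s changes the base to g^s but leaves the exponents a, b and the logarithm of g_3 unchanged, so the Diffie-Hellman property is preserved in both directions.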

The representation problem and the decision Diffie-Hellman problem are 
studied, for example, in [Brands93]. 


Exercises 


1. Let $p$ be a sufficiently large prime such that it is intractable to compute
discrete logarithms in $\mathbb{Z}_p^*$. Let $g$ be a primitive root in $\mathbb{Z}_p^*$. $p$ and $g$ are
publicly known. Alice has a secret key $x_A$ and a public key $y_A := g^{x_A}$.
Bob has a secret key $x_B$ and a public key $y_B := g^{x_B}$. Alice and Bob
establish a secret shared key by executing the following protocol (see
[MatTakIma86]):


Protocol 4.23. 
A variant of the Diffie-Hellman key agreement protocol: 


1. Alice chooses at random $a$, $0 \leq a \leq p - 2$, sets $c := g^a$ and
sends $c$ to Bob.

2. Bob chooses at random $b$, $0 \leq b \leq p - 2$, sets $d := g^b$ and
sends $d$ to Alice.

3. Alice computes the shared key $k = d^{x_A} y_B^{a} = g^{b x_A + a x_B}$.

4. Bob computes the shared key $k = c^{x_B} y_A^{b} = g^{a x_B + b x_A}$.


Does the protocol provide entity authentication? Discuss the security of 
the protocol. 


2. Let $n := pq$, where $p$ and $q$ are distinct primes and $x_1, x_2 \in \mathbb{Z}_n^*$. Assume
that at least one of $x_1$ and $x_2$ is in $\mathrm{QR}_n$. Peggy wants to prove to Vic
that she knows a square root of $x_i$ for at least one $i \in \{1, 2\}$ without
revealing $i$. Modify Protocol 4.5 to get an interactive zero-knowledge
proof of knowledge.


3. Besides interactive proofs of knowledge, there are interactive proofs for
proving the membership in a language. The completeness and soundness
conditions for such proofs are slightly different. Let $(P, V)$ be an
interactive proof system. $P$ and $V$ are probabilistic algorithms, but only $V$ is
assumed to have polynomial running time. By $P^*$ we denote a general
(possibly dishonest) prover. Let $L \subseteq \{0,1\}^*$ ($L$ is called a language).
Bit strings $x \in \{0,1\}^*$ are supplied to $(P, V)$ as common input. $(P, V)$
is called an interactive proof system for the language $L$ if the following
conditions are satisfied:

a. Completeness. If $x \in L$, then the probability that the verifier $V$
accepts, if interacting with the honest prover $P$, is $\geq 3/4$.
b. Soundness. If $x \notin L$, then the probability that the verifier $V$ accepts,
if interacting with any prover $P^*$, is $\leq 1/2$.

Such an interactive proof system $(P, V)$ is (perfect) zero-knowledge if
there is a probabilistic simulator $S(V^*, x)$, running in expected polynomial
time, such that for every verifier $V^*$ (honest or not) and for every
$x \in L$ the distributions of the random variables $S(V^*, x)$ and $(P, V^*)(x)$
are equal.
The class of languages that have interactive proof systems is denoted by
IP. It generalizes the complexity class BPP (see Exercise 3 in Chapter 5).
As in Section 4.3.1, let $n := pq$, with $p$ and $q$ distinct primes, and let
$J_n^{+1} := \{x \in \mathbb{Z}_n^* \mid \left(\frac{x}{n}\right) = 1\}$ be the units with Jacobi symbol 1. Let
$\mathrm{QNR}_n^{+1} := J_n^{+1} \setminus \mathrm{QR}_n$ be the quadratic non-residues in $J_n^{+1}$.
The following protocol is an interactive proof system for the language
$\mathrm{QNR}_n^{+1}$ (see [GolMicRac89]). The common input $x$ is assumed to be in
$J_n^{+1}$ (whether or not $x \in \mathbb{Z}_n^*$ is in $J_n^{+1}$ can be efficiently determined using
a deterministic algorithm; see Algorithm A.59).


Protocol 4.24. 
Quadratic non-residuosity: 


Let $x \in J_n^{+1}$.
1. Vic chooses $r \in \mathbb{Z}_n^*$ and $\sigma \in \{0,1\}$ uniformly at random and
sends $a = r^2 x^{\sigma}$ to Peggy.

2. Peggy computes

\[ \tau := \begin{cases} 0 & \text{if } a \in \mathrm{QR}_n, \\ 1 & \text{if } a \notin \mathrm{QR}_n, \end{cases} \]

and sends $\tau$ to Vic.
(Note that it is not assumed that Peggy can solve this in
polynomial time. Thus, she can find out whether $a \in \mathrm{QR}_n$,
for example, by an exhaustive search.)

3. Vic accepts if and only if $\sigma = \tau$.


Show:
a. If $x \in \mathrm{QNR}_n^{+1}$ and both follow the protocol, Vic will always accept.

b. If $x \notin \mathrm{QNR}_n^{+1}$, then Vic accepts with probability $\leq 1/2$.

c. Show that the protocol is not zero-knowledge (under the quadratic
residuosity assumption; see Remark 1 in Section 4.3.1).

d. The protocol is honest-verifier zero-knowledge.

e. Modify the protocol to get a zero-knowledge proof for quadratic
non-residuosity.
4. We consider the identification scheme based on public-key encryption
introduced in Section 4.2.1. In this scheme a dishonest verifier can obtain
knowledge from the prover. Improve the scheme.


5. We modify the commitment scheme based on quadratic residues.


Protocol 4.25. 
QRCommitment: 


1. System setup. Alice chooses distinct large prime numbers
$p, q \equiv 3 \bmod 4$ and sets $n := pq$. (Note $-1 \in J_n^{+1} \setminus \mathrm{QR}_n$;
see Proposition A.53.)

2. Commit to $b \in \{0,1\}$. Alice chooses $r \in \mathbb{Z}_n^*$ at random, sets
$c := (-1)^b r^2$ and sends $c$ to Bob.

3. Reveal. Alice sends $p$, $q$, $r$ and $b$ to Bob. Bob can verify that
$p$ and $q$ are primes $\equiv 3 \bmod 4$, $r \in \mathbb{Z}_n^*$, and $c = (-1)^b r^2$.


Show:
a. If $c$ is a commitment to $b$, then $-c$ is a commitment to $1 - b$.
b. If $c_i$ is a commitment to $b_i$, $i = 1, 2$, then $c_1 c_2$ is a commitment to
$b_1 \oplus b_2$.
c. Show how Alice can prove to Bob that two commitments $c_1$ and $c_2$
commit to equal or distinct values, without opening them.


6. Let $P = \{P_i \mid i = 1, \ldots, 6\}$. Set up a secret sharing system, such that
exactly the groups $\{P_1, P_2\}$, $\{Q \subset P \mid |Q| \geq 3, P_1 \in Q\}$ and
$\{Q \subset P \mid |Q| \geq 4, P_2 \in Q\}$ are able to reconstruct the secret.


7. Let $P = \{P_1, P_2, P_3, P_4\}$. Is it possible to set up a secret sharing system
by use of Shamir’s threshold scheme, such that the members of a group
$Q \subset P$ are able to reconstruct the secret if and only if $\{P_1, P_2\} \subset Q$ or
$\{P_3, P_4\} \subset Q$?

8. In the voting scheme of Section 4.4, it is necessary that each authority
and each voter proves that he really follows the protocol. Explain why.


9. Let $p$ and $q$ be large primes such that $q$ divides $p - 1$. Let $G$ be the
subgroup of order $q$ in $\mathbb{Z}_p^*$, $g, h, y_i, z_i \in G$, $i = 1, 2$. Peggy wants to prove
to Vic that she knows an $x$, such that $y_i := g^x$ and $z_i := h^x$ for at
least one $i \in \{1, 2\}$, without revealing $i$. Modify Protocol 4.15 to get an
interactive proof of knowledge. Show how the interactive proof can be
converted into a non-interactive one.


10. We consider the problem of vote duplication. This means that a voter can
duplicate the vote of another voter who has previously posted his vote.
He can do this without knowing the content of the other voter’s ballot.
Discuss this problem for the voting scheme of Section 4.4.


11. Blind RSA signatures. Construct a blind signature scheme based on
the fact that the RSA function is a homomorphism.


12. Nyberg-Rueppel Signatures. Let $p$ and $q$ be large primes such that
$q$ divides $p - 1$. Let $G$ be the subgroup of order $q$ in $\mathbb{Z}_p^*$, and let $g$ be a
generator of $G$. The secret key of the signer is a randomly chosen $x \in \mathbb{Z}_q^*$,
the public key is $y := g^x$.
Signing. We assume that the message $m$ to be signed is an element in
$\mathbb{Z}_p^*$. The signed message is produced using the following steps:

1. Select a random integer $k$, $1 \leq k \leq q - 1$.

2. Set $r := m g^k$ and $s := xr + k \bmod q$.

3. $(m, r, s)$ is the signed message.

Verification. If $1 \leq r \leq p - 1$, $1 \leq s \leq q - 1$ and $m = r y^r g^{-s}$, accept
the signature. If not, reject it.


a. Show that the verification condition holds for a signed message.
b. Show that it is easy to produce forged signatures.
c. How can you prevent this attack?
d. Show that the condition $1 \leq r \leq p - 1$ has to be checked to detect
forged signatures, even if the scheme is modified as in item c.


13. Blind Nyberg-Rueppel Signatures (see also [CamPivSta94]). In the
situation of Exercise 12, Bob gets a blind signature for a message
$m \in \{1, \ldots, q-1\}$ from Alice by executing the following protocol:


Protocol 4.26.
BlindNybergRueppelSig(m):


1. Alice chooses $k$ at random, $1 \leq k \leq q - 1$, and sets $\tilde{a} := g^k$.
Alice sends $\tilde{a}$ to Bob.

2. Bob chooses $\alpha, \beta$ uniformly at random with $1 \leq \alpha \leq q - 1$
and $0 \leq \beta \leq q - 1$, sets $\tilde{m} := m \tilde{a}^{\alpha - 1} g^{\beta} \alpha^{-1}$, and sends $\tilde{m}$ to
Alice.

3. Alice computes $\tilde{r} := \tilde{m} g^k$, $\tilde{s} := \tilde{r} x + k \bmod q$, and sends $\tilde{r}$
and $\tilde{s}$ to Bob.

4. Bob checks whether $(\tilde{m}, \tilde{r}, \tilde{s})$ is a valid signed message. If it
is, then he sets $r := \tilde{r} \alpha$ and $s := \tilde{s} \alpha + \beta$.


Show that $(m, r, s)$ is a signed message and that the protocol is really
blind.


14. Proof of Knowledge of a Representation (see [Okamoto92]). Let $p$
and $q$ be large primes, such that $q$ divides $p - 1$. Let $G$ be the subgroup
of order $q$ in $\mathbb{Z}_p^*$, and $g_1$ and $g_2$ be independently chosen generators. The
secret is a randomly chosen $(x_1, x_2) \in \{0, \ldots, q-1\}^2$, and the public key
is $(p, q, g_1, g_2, y)$, where $y := g_1^{x_1} g_2^{x_2} \in G$.

How can Peggy convince Vic by an interactive proof of knowledge that
she knows $(x_1, x_2)$, which is a representation of $y$ with respect to $(g_1, g_2)$?


15. Convert the interactive proof of Exercise 14 into a blind signature scheme.
Probabilistic algorithms are important in cryptography. On the one hand,
the algorithms used in encryption and digital signature schemes often include 
random choices (as in Vernam’s one-time pad or the DSA) and therefore are 
probabilistic. On the other hand, when studying the security of cryptographic 
schemes, adversaries are usually modeled as probabilistic algorithms. The 
subsequent chapters, which deal with provable security properties, require a 
thorough understanding of this notion. Therefore, we clarify what is meant 
precisely by a probabilistic algorithm, and discuss the underlying probabilistic 
model. 

The output y of a deterministic algorithm A is completely determined by 
its input x. In a deterministic way, y is computed from x by a sequence of
steps decided in advance by the programmer. A behaves like a mathematical 
mapping: applying A to the same input x several times always yields the same 
output y. Therefore, we may use the mathematical notation of a mapping, 
$A : X \to Y$, for a deterministic algorithm A, with inputs from X and
outputs in Y. There are various equivalent formal models for such algorithms.
A popular one is the description of algorithms by Turing machines (see, for 
example, [HopUll79]). Turing machines are state machines, and deterministic
algorithms are modeled by Turing machines with deterministic behavior: the 
state transitions are completely determined by the input. 

A probabilistic algorithm A is an algorithm whose behavior is partly con- 
trolled by random events. The computation of the output y on input x de- 
pends on the outcome of a finite number of random experiments. In partic- 
ular, applying A to the same input x twice may yield two different outputs. 


5.1 Coin-Tossing Algorithms 


Probabilistic algorithms are able to toss coins. The control flow depends on 
the outcome of the coin tosses. Therefore, probabilistic algorithms exhibit 
random behavior. 


Definition 5.1. Given an input $x$, a probabilistic (or randomized) algorithm
$A$ may toss a coin a finite number of times during its computation of the
output $y$, and the next step may depend on the results of the preceding coin
tosses. The number of coin tosses may depend on the outcome of the previous
ones, but it is bounded by some constant $t_x$, for a given input $x$. The coin
tosses are independent and the coin is a fair one, i.e., each side appears with
probability 1/2.


Examples. The encryption algorithms in Vernam’s one-time pad (Section 
2.1), OAEP (Section 3.3.4) and ElGamal’s scheme (Section 3.5) include ran- 
dom choices, and thus are probabilistic, as well as the signing algorithms in 
PSS (Section 3.4.5), ElGamal’s scheme (Section 3.5.2) and the DSA (Section 
3.5.3). Other examples of probabilistic algorithms are the algorithm for
computing square roots in $\mathbb{Z}_p^*$ (see Algorithm A.61) and the probabilistic primality
tests discussed in Appendix A.8. Many examples of probabilistic algorithms 
in various areas of application can be found, for example, in [MotRag95]. 


Remarks and Notations: 


1. A formal definition of probabilistic algorithms can be given by the notion
of probabilistic Turing machines ([LeeMooShaSha55]; [Rabin63];
[Santos69]; [Gill77]; [BalDiaGab95]).¹ In a probabilistic Turing machine,
the state transitions are determined by the input and the outcome of coin 
tosses. Probabilistic Turing machines should not be confused with non- 
deterministic machines. A non-deterministic Turing machine is “able to 
simply guess the solution to the given problem” and thus, in general, is 
not something that can be implemented in practice. A probabilistic ma- 
chine (or algorithm) is able to find the solution by use of its coin tosses, 
with some probability. Thus, it is something that can be implemented in 
practice. 

Of course, we have to assume (and will assume in the following) that a 
random source of independent fair coin tosses is available. To implement 
such a source, the inherent randomness in physical phenomena can be 
exploited (see [MenOorVan96] and [Schneier96] for examples of sources 
which might be used in a computer). 

To derive perfectly random bits from a natural source is a non-trivial task. 
The output bits may be biased (i.e., the probability that 1 is emitted is 
different from 1/2) or correlated (the probability of 1 depends on the 
previously emitted bits). The outcomes of physical processes are often 
affected by previous outcomes and the circumstances that led to these 
outcomes. If the bits are independent, the problem of biased bits can be 
easily solved using the following method proposed by John von Neumann 
([von Neumann63]): break the sequence of bits into pairs, discard pairs 
00 and 11, and interpret 01 as 0 and 10 as 1 (the pairs 01 and 10 have 
the same probability). Handling a correlated bit source is more difficult. 
However, there are effective means of generating truly random sequences
of bits from a biased and correlated source. For example, Blum developed
a method for a source which produces bits according to a known Markov
chain ([Blum84]). Vazirani ([Vazirani85]) shows how almost independent,
unbiased bits can be derived from two independent “slightly-random”
sources. For a discussion of slightly random sources and their use in
randomized algorithms, see [Papadimitriou94], for example.


¹ All algorithms are assumed to have a finite description (as a Turing machine)
which is independent of the size of the input. We do not consider non-uniform
algorithms in this book.
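
Von Neumann's method for independent but biased bits, described in Remark 1, can be sketched in a few lines of Python. The function name and the list-of-integers representation of the bit sequence are assumptions made for the example.

```python
def von_neumann_extract(bits):
    """Derive unbiased bits from independent, possibly biased bits.

    The input is split into consecutive pairs. The pairs 00 and 11 are
    discarded, 01 is interpreted as 0 and 10 as 1. A trailing unpaired
    bit is ignored.
    """
    output = []
    for k in range(0, len(bits) - 1, 2):
        pair = (bits[k], bits[k + 1])
        if pair == (0, 1):
            output.append(0)
        elif pair == (1, 0):
            output.append(1)
    return output
```

If each input bit is 1 with probability e, independently of the others, then both 01 and 10 occur with probability e(1 - e), which is why the output bits are unbiased. The price is that, on average, only a fraction e(1 - e) of the pairs yields an output bit.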

2. The output y of a probabilistic algorithm A depends on the input x 
and on the binary string r, which describes the outcome of the coin 
tosses. Usually, the coin tosses are considered as internal operations of the 
probabilistic algorithm. A second way to view a probabilistic algorithm 
A is to consider the outcome of the coin tosses as an additional input, 
which is supplied by an external coin-tossing device. In this view, the 
model of a probabilistic algorithm is a deterministic machine. We call the 
corresponding deterministic algorithm $A_D$ the deterministic extension of
A. It takes as inputs the original input x and the outcome r of the coin
tosses. 

3. Given $x$, the output $A(x)$ of a probabilistic algorithm $A$ is not a single
constant value, but a random variable. “$A$ outputs $y$ on input $x$” is a
random event, and by $\mathrm{prob}(A(x) = y)$ we mean the probability of this
event. More precisely, we have


\[ \mathrm{prob}(A(x) = y) := \mathrm{prob}(\{r \mid A_D(x, r) = y\}). \]²


Here a question arises: what probability distribution of the coin tosses is
meant? The question is easily answered if, as in our definition of
probabilistic algorithms, the number of coin tosses is bounded by some
constant $t_x$, for a given $x$. In this case, adding some dummy coin tosses, if
necessary, we may assume that the number of coin tosses is exactly $t_x$.
Then the possible outcomes $r$ of the coin tosses are the binary strings of
length $t_x$, and since the coin tosses are independent, we have the uniform
distribution on $\{0,1\}^{t_x}$. The probability of an outcome $r$ is $1/2^{t_x}$, and
hence

\[ \mathrm{prob}(A(x) = y) = \frac{|\{r \mid A_D(x, r) = y\}|}{2^{t_x}}. \]

It is sufficient for all our purposes to consider only probabilistic
algorithms with a bounded number of coin tosses, for a given $x$. In most parts
of this book we consider algorithms whose running time is bounded by a
function $f(|x|)$, where $|x|$ is the size of the input $x$. For these algorithms,
the assumption is obviously true.

4. Given $x$, the probabilities $\mathrm{prob}(A(x) = y)$, $y \in Y$, define a probability
distribution on the range $Y$. We denote it by $p_{A(x)}$. The random variable
$A(x)$ samples $Y$ according to the distribution $p_{A(x)}$.


² If the probability distribution is determined by the context, we often do not
specify the distribution explicitly and simply write prob(e) for the probability of
an element or event e (see Appendix B.1).
5. The setting where a probabilistic algorithm $A$ is executed may include
further random events. Now, tossing a fair coin in $A$ is assumed to be
an independent random experiment. Therefore, the outcome of the coin
tosses of $A$ on input $x$ is independent of all further random events in the
given setting. In the following items, we apply this basic assumption.

6. Suppose that the input $x \in X$ of a probabilistic algorithm $A$ is randomly
generated. This means that a probability distribution $p_X$ is given
for the domain $X$ (e.g. the uniform distribution). We may consider the
random experiment “Randomly choose $x \in X$ according to $p_X$ and
compute $y = A(x)$”. If the outputs of $A$ are in $Y$, then the experiment is
modeled by a joint probability space $(XY, p_{XY})$. The coin tosses of $A(x)$
are independent of the random choice of $x$. Thus, the probability that
$x \in X$ is chosen and that $y = A(x)$ is

\[ \mathrm{prob}(x, A(x) = y) = p_{XY}(x, y) = p_X(x) \cdot \mathrm{prob}(A(x) = y). \]

The probability $\mathrm{prob}(A(x) = y)$ is the conditional probability $\mathrm{prob}(y \mid x)$
of the outcome $y$, assuming the input $x$.


7. Each execution of $A$ is a new independent random experiment: the coin
tosses during one execution of $A$ are independent of the coin tosses in
other executions of $A$. In particular, when executing $A$ twice, with inputs
$x$ and $x'$, we have

\[ \mathrm{prob}(A(x) = y, A(x') = y') = \mathrm{prob}(A(x) = y) \cdot \mathrm{prob}(A(x') = y'). \]

If probabilistic algorithms $A$ and $B$ are applied to inputs $x$ and $x'$, then
the coin tosses of $A(x)$ and $B(x')$ are independent, unless $B(x')$ is called
as a subroutine of $A(x)$ (such that the coin tosses of $B(x')$ are contained
in the coin tosses of $A(x)$), or vice versa:

\[ \mathrm{prob}(A(x) = y, B(x') = y') = \mathrm{prob}(A(x) = y) \cdot \mathrm{prob}(B(x') = y'). \]


8. Let $A$ be a probabilistic algorithm with inputs from $X$ and outputs in $Y$.
Let $h : X \to Z$ be a map yielding some property $h(x)$ for the elements in
$X$ (e.g. the least-significant bit of $x$). Let $B$ be a probabilistic algorithm
which on input $y \in Y$ outputs $B(y) \in Z$. Assume that a probability
distribution $p_X$ is given on $X$. $B$ might be an algorithm trying to invert
$A$ or at least trying to determine the property $h(x)$ from $y := A(x)$. We
are interested in the random experiment “Randomly choose $x$, compute
$y = A(x)$ and $B(y)$, and check whether $B(y) = h(x)$”. The random choice
of $x$, the coin tosses of $A(x)$ and the coin tosses of $B(y)$ are independent
random experiments. Thus, the probability that $x \in X$ is chosen, and
that $A(x) = y$ and $B(y)$ correctly computes $h(x)$ is

\[ \mathrm{prob}(x, A(x) = y, B(y) = h(x)) = \mathrm{prob}(x) \cdot \mathrm{prob}(A(x) = y) \cdot \mathrm{prob}(B(y) = h(x)). \]
9. Let $p_X$ be a probability distribution on the domain $X$ of a probabilistic
algorithm $A$ with outputs in $Y$. Randomly selecting an $x \in X$ and
computing $A(x)$ is described by the joint probability space $XY$ (see above).
We can project to $Y$, $(x, y) \mapsto y$, and calculate the probability
distribution $p_Y$ on $Y$:

\[ p_Y(y) := \sum_{x \in X} p_{XY}(x, y) = \sum_{x \in X} p_X(x) \cdot \mathrm{prob}(A(x) = y). \]

We call $p_Y$ the image of $p_X$ under $A$. $p_Y$ is also the image of the joint
distribution of $X$ and the coin tosses $r$ under the deterministic extension
$A_D$ of $A$:

\[ p_Y(y) = \mathrm{prob}(\{(x, r) \mid A_D(x, r) = y\}) = \sum_{x \in X,\, r \in \{0,1\}^{t_x} :\, A_D(x, r) = y} p_X(x) \cdot \mathrm{prob}(r). \]

As in the deterministic case (see Appendix B.1, p. 330), we sometimes
denote the image distribution by $\{A(x) : x \leftarrow X\}$.
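
Remarks 3 and 9 can be made concrete with a small Python sketch. The algorithm below is a made-up toy example (it is not taken from the text): its deterministic extension `A_D` receives the input x together with the outcome r of exactly T = 3 coin tosses, and the probabilities are obtained by enumerating all coin-toss strings.

```python
from fractions import Fraction
from itertools import product

T = 3  # number of coin tosses t_x, assumed to be the same for every input x


def A_D(x: int, r) -> int:
    """Deterministic extension of a toy probabilistic algorithm:
    adds the number of 1s among the coin tosses to x, modulo 4."""
    return (x + sum(r)) % 4


def prob_output(x: int, y: int) -> Fraction:
    """prob(A(x) = y) = |{r : A_D(x, r) = y}| / 2^T."""
    hits = sum(1 for r in product((0, 1), repeat=T) if A_D(x, r) == y)
    return Fraction(hits, 2 ** T)


def image_distribution(p_X: dict) -> dict:
    """Image p_Y of p_X under A: sum of p_X(x) * prob(r) over all (x, r)
    with A_D(x, r) = y."""
    p_Y = {}
    for x, px in p_X.items():
        for r in product((0, 1), repeat=T):
            y = A_D(x, r)
            p_Y[y] = p_Y.get(y, Fraction(0)) + px * Fraction(1, 2 ** T)
    return p_Y
```

Enumerating all 2^T strings is only feasible for very small T; the point of the sketch is that every probability attached to A reduces to counting coin-toss outcomes of the deterministic extension.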


Let A be a probabilistic algorithm with inputs from X and outputs in Y. 
Then A (as a Turing machine) has a finite binary description. In particular, 
we can assume that both the domain X and the range Y are subsets of 
{0,1}*. The time and space complexity of an algorithm A (corresponding to 
the running time and memory requirements) are measured as functions of 
the binary length $|x|$ of the input $x$.


Definition 5.2. A probabilistic polynomial algorithm is a probabilistic
algorithm $A$, such that the running time of $A(x)$ is bounded by $P(|x|)$, where
$P \in \mathbb{Z}[X]$ is a polynomial (the same for all inputs $x$). The running time is
measured as the number of steps in our model of algorithms, i.e., the number 
of steps of the probabilistic Turing machine. Tossing a coin is one step in this 
model. 


Remark. The randomness in probabilistic algorithms is caused by random 
events in a very specific probability space, namely $\{0,1\}^{t_x}$ with the uniform
distribution, and at first glance this might look restrictive. Actually, it is a 
very general model. 

For example, suppose you want to control an algorithm $A$ on input $x$ by $r_x$
random events with probabilities $p_{x,1}, \ldots, p_{x,r_x}$ (the deterministic extension
$A_D$ takes as inputs $x$ and one of the events).³ Assume that always one of
the events occurs (i.e., $\sum_{i=1}^{r_x} p_{x,i} = 1$) and that the probabilities $p_{x,i}$ have a
finite binary representation $p_{x,i} = \sum_{j=1}^{t_x} a_{x,i,j} \cdot 2^{-j}$ ($a_{x,i,j} \in \{0,1\}$). Further,
assume that $r_x$, $t_x$ and the probabilities $p_{x,i}$ are computable by deterministic
(polynomial) algorithms with input $x$. The last assumption is satisfied, for
example, if the events and probabilities are the same for all $x$.


³ For example, think of an algorithm A that attacks an encryption scheme. If all
possible plaintexts and their probability distribution are known, then A might
be based on the random choice of correctly distributed plaintexts.

Then the random behavior of $A$ can be implemented by coin tosses, i.e.,
$A$ can be implemented as a probabilistic (polynomial) algorithm in the sense
of Definitions 5.1 and 5.2. Namely, let $S$ be the coin-tossing algorithm which
on input $x$:


(1) tosses the coin $t_x$ times and obtains a binary number $b := b_{t_x - 1} \ldots b_1 b_0$,
with $0 \leq b < 2^{t_x}$, and
(2) returns $S(x) := i$, if $2^{t_x} \sum_{j=1}^{i-1} p_{x,j} \leq b < 2^{t_x} \sum_{j=1}^{i} p_{x,j}$.


The outputs of $S(x)$ are in $\{1, \ldots, r_x\}$ and $\mathrm{prob}(S(x) = i) = p_{x,i}$, for
$1 \leq i \leq r_x$. The probabilistic (polynomial) algorithm $S$ can be used to produce
the random inputs for $A_D$.
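
A minimal Python sketch of the coin-tossing algorithm S is given below. It assumes that the probabilities are passed as exact fractions with t-bit binary representations; the parameter names and the injectable `toss` function are assumptions made for the example.

```python
import random
from fractions import Fraction


def S(probs, t, toss=lambda: random.getrandbits(1)):
    """Sample an index i in {1, ..., len(probs)} with probability probs[i-1].

    probs: probabilities that sum to 1, each a multiple of 2^(-t).
    t:     number of fair coin tosses.
    toss:  function returning one fair coin toss (0 or 1).
    """
    # Step (1): t coin tosses yield a binary number b with 0 <= b < 2^t.
    b = 0
    for _ in range(t):
        b = 2 * b + toss()
    # Step (2): return i with 2^t * sum_{j<i} p_j <= b < 2^t * sum_{j<=i} p_j.
    scale = 2 ** t
    cumulative = Fraction(0)
    for i, p_i in enumerate(probs, start=1):
        lower = cumulative * scale
        cumulative += p_i
        if lower <= b < cumulative * scale:
            return i
    raise ValueError("probabilities must sum to 1")
```

With `probs = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8)]` and `t = 3`, the eight equally likely values of b are split into blocks of sizes 4, 2, 1 and 1, so each index is returned with exactly the prescribed probability.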


5.2 Monte Carlo and Las Vegas Algorithms 


The running time of a probabilistic polynomial algorithm A is required to be 
bounded by a polynomial P, for all inputs x. Assume A tries to compute the 
solution to a problem. Due to its random behavior, A might not reach its goal 
with certainty, but only with some probability. Therefore, the output might 
not be correct in some cases. Such algorithms are also called Monte Carlo 
algorithms if their probability of success is not too low. They are distinguished 
from Las Vegas algorithms. 

$\mathrm{time}_A(x)$ denotes the running time of $A$ on input $x$, i.e., the number of
steps $A$ needs to generate the output $A(x)$ for the input $x$. As before, $|x|$
denotes the binary length of $x$.


Definition 5.3. Let P be a computational problem. 


1. A Monte Carlo algorithm $A$ for P is a probabilistic algorithm $A$, whose
running time is bounded by a polynomial $Q$ and which yields a correct
answer to P with a probability of at least 2/3:

\[ \mathrm{time}_A(x) \leq Q(|x|) \quad \text{and} \quad \mathrm{prob}(A(x) \text{ is a correct answer to P}) \geq \frac{2}{3}, \]

for all instances $x$ of P.

2. A probabilistic algorithm $A$ for P is called a Las Vegas algorithm if its
output is always a correct answer to P, and if the expected value for the
running time is bounded by a polynomial $Q$:

\[ E(\mathrm{time}_A(x)) = \sum_{t=1}^{\infty} t \cdot \mathrm{prob}(\mathrm{time}_A(x) = t) \leq Q(|x|), \]

for all instances $x$ of P.
Remarks:


1. The probabilities are computed assuming a fixed input $x$; they are only
taken over the coin tosses during the computation of $A(x)$. The distribution
of the inputs $x$ is not considered. For example,

\[ \mathrm{prob}(A(x) \text{ is a correct answer to } P) = \sum_{y \in Y_x} \mathrm{prob}(A(x) = y), \]

with the sum taken over the set $Y_x$ of correct answers for input $x$.

2. The running time of a Monte Carlo algorithm is bounded by one poly- 
nomial Q, for all inputs. A Monte Carlo algorithm may sometimes fail 
to produce a correct answer to the given problem. However, the prob- 
ability of such a failure is bounded. In contrast, a Las Vegas algorithm 
always gives a correct answer to the given problem. The running time 
may vary substantially and is not necessarily bounded by a single poly- 
nomial. However, the expected value of the running time is polynomial. 


Examples: 


1. Typical examples of Monte Carlo algorithms are the probabilistic pri- 
mality tests discussed in Appendix A.8. They check whether an integer 
is prime or not. 

2. Algorithm A.61, which computes square roots modulo a prime p, is a Las 
Vegas algorithm. 

3. To prove the zero-knowledge property of an interactive proof system, it 
is necessary to simulate the prover by a Las Vegas algorithm (see Section 
4.2.3). 


A Las Vegas algorithm may be turned into a Monte Carlo algorithm 
simply by stopping it after a suitable polynomial number of steps. To state 
this fact more precisely, we use the notion of positive polynomials. 


Definition 5.4. A polynomial $P(X) = \sum_{i=0}^{n} a_i X^i \in \mathbb{Z}[X]$ in one variable
$X$, with integer coefficients $a_i$, $0 \le i \le n$, is called a positive polynomial if
$P(x) > 0$ for $x > 0$, i.e., $P$ has positive values for positive inputs.


Examples. The polynomials $X^n$ are positive. More generally, each polynomial
whose non-zero coefficients are positive is a positive polynomial.


Proposition 5.5. Let $A(x)$ be a Las Vegas algorithm for a problem $\mathcal{P}$ with
an expected running time $\le Q(|x|)$ ($Q$ a polynomial). Let $P$ be a positive
polynomial. Let $\tilde{A}$ be the algorithm obtained by stopping $A(x)$ after at most
$P(|x|)$ steps. Then $\tilde{A}$ is a Monte Carlo algorithm for $\mathcal{P}$, which gives a correct
answer to $\mathcal{P}$ with probability $\ge 1 - Q(|x|)/P(|x|)$.

Proof. We have

\[ P(|x|) \cdot \mathrm{prob}(\mathrm{time}_A(x) \ge P(|x|)) \le \sum_{t=P(|x|)}^{\infty} t \cdot \mathrm{prob}(\mathrm{time}_A(x) = t) \le E(\mathrm{time}_A(x)) \le Q(|x|). \]

Thus, $\tilde{A}$ gives a correct answer to $\mathcal{P}$ with probability $\ge 1 - Q(|x|)/P(|x|)$.


Remark. $\tilde{A}(x)$ might return "I don't know" if $A(x)$ did not yet terminate
after $P(|x|)$ steps and is stopped. Then the answer of $\tilde{A}$ is never false, though
it is not always a solution to $\mathcal{P}$.
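
The truncation of a Las Vegas algorithm can be illustrated with a toy search problem. The sketch below is not from the original text: it assumes that one random probe counts as one step, and the names `las_vegas_find`, `truncated_find` and `markov_failure_bound` are hypothetical.

```python
import random
from fractions import Fraction


def las_vegas_find(marked, rng):
    """Probe random positions until a marked one is hit.

    The answer is always correct; only the number of probes is random.
    Returns (index, number_of_probes).
    """
    n = len(marked)
    steps = 0
    while True:
        steps += 1
        i = rng.randrange(n)
        if marked[i]:
            return i, steps


def truncated_find(marked, budget, rng):
    """Same search, stopped after at most `budget` probes.

    Returns a marked index, or None ("I don't know") if the budget ran out.
    """
    n = len(marked)
    for _ in range(budget):
        i = rng.randrange(n)
        if marked[i]:
            return i
    return None


def markov_failure_bound(expected_steps, budget):
    """Upper bound Q/P on the failure probability from the proof above."""
    return Fraction(expected_steps, budget)
```

With one marked position out of four, the expected number of probes is 4, so a budget of 12 probes fails with probability at most 4/12 by the estimate above. The true failure probability, (3/4)^12, is much smaller, which shows that the bound is not tight.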


The choice of the bound 2/3 in our definition of Monte Carlo algorithms is 
somewhat arbitrary. Other bounds could be used equally well, as Proposition 
5.6 below shows. We can increase the probability of success of an algorithm 
A(x) by repeatedly applying the original algorithm and by making a majority 
decision. This works if the probability that $A$ produces a correct result exceeds
1/2 by a non-negligible amount.


Proposition 5.6. Let $P$ and $Q$ be positive polynomials. Let $A$ be a probabilistic
polynomial algorithm which computes a function $f : X \to Y$, with

\[ \mathrm{prob}(A(x) = f(x)) \ge \frac{1}{2} + \frac{1}{P(|x|)} \quad \text{for all } x \in X. \]

Then, by repeating the computation $A(x)$ and returning the most frequent
result, we get a probabilistic polynomial algorithm $\tilde{A}$, such that

\[ \mathrm{prob}(\tilde{A}(x) = f(x)) > 1 - \frac{1}{Q(|x|)} \quad \text{for all } x \in X. \]


Proof. Consider the algorithms $A_t$ ($t \in \mathbb{N}$), defined on input $x \in X$ as follows:


1. Execute $A(x)$ $t$ times, and get the set $Y_x := \{y_1, \ldots, y_t\}$ of outputs.
2. Select an $i \in \{1, \ldots, t\}$, with $|\{y \in Y_x \mid y = y_i\}|$ maximal.
3. Set $A_t(x) := y_i$.


We expect that more than half of the results of $A$ coincide with $f(x)$, and
hence $A_t(x) = f(x)$ with high probability. More precisely, define the binary
random variables $S_j$, $1 \le j \le t$:

\[ S_j := \begin{cases} 1 & \text{if } y_j = f(x), \\ 0 & \text{otherwise.} \end{cases} \]

The expected values $E(S_j)$ are equal to

\[ E(S_j) = \mathrm{prob}(A(x) = f(x)) \ge \frac{1}{2} + \frac{1}{P(|x|)}, \]

and we conclude, by Corollary B.18, that

\[ \mathrm{prob}(A_t(x) = f(x)) \ge \mathrm{prob}\Bigl( \sum_{j=1}^{t} S_j > \frac{t}{2} \Bigr) \ge 1 - \frac{P(|x|)^2}{4t}. \]

For $t > \frac{1}{4} \cdot P(|x|)^2 \cdot Q(|x|)$, the probability of success of $A_t$ is $> 1 - 1/Q(|x|)$.
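
The majority decision used in the proof can be sketched as follows. This is an illustrative sketch rather than the book's own code; `amplify_by_majority` and `repetitions_needed` are hypothetical names, and the arguments `p_val` and `q_val` stand for the integer values P(|x|) and Q(|x|).

```python
import random
from collections import Counter


def amplify_by_majority(algorithm, x, t):
    """A_t from the proof: run algorithm(x) t times, return the most frequent output."""
    outputs = [algorithm(x) for _ in range(t)]
    return Counter(outputs).most_common(1)[0][0]


def repetitions_needed(p_val, q_val):
    """Smallest integer t with t > P^2 * Q / 4, the bound used in the proof."""
    return (p_val * p_val * q_val) // 4 + 1
```

For example, an algorithm that is correct with probability 1/2 + 1/10 (so P = 10) reaches a success probability above 1 - 1/20 (Q = 20) after `repetitions_needed(10, 20)`, i.e. 501, repetitions. The bound from Corollary B.18 is generous; in practice far fewer repetitions usually suffice.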


Remark. By using the Chernoff bound from probability theory, the preceding 
proof can be modified to get a polynomial algorithm A whose probability 
of success is exponentially close to 1 (see Exercise 5). We do not need this 
stronger result. If a polynomial algorithm which checks the correctness of a 
solution is available, we can do even better. 


Proposition 5.7. Let $P$, $Q$ and $R$ be positive polynomials. Let $A$ be a probabilistic
polynomial algorithm computing solutions to a given problem $\mathcal{P}$ with

\[ \mathrm{prob}(A(x) \text{ is a correct answer to } \mathcal{P}) \ge \frac{1}{P(|x|)}, \quad \text{for all inputs } x. \]

Let $D$ be a deterministic polynomial algorithm, which checks whether a given
answer to $\mathcal{P}$ is correct. Then, by repeating the computation $A(x)$ and by
checking the results with $D$, we get a probabilistic polynomial algorithm $\tilde{A}$ for
$\mathcal{P}$, such that

\[ \mathrm{prob}(\tilde{A}(x) \text{ is a correct answer to } \mathcal{P}) > 1 - 2^{-Q(|x|)}, \quad \text{for all inputs } x. \]


Proof. Consider the algorithms $A_t$ ($t \in \mathbb{N}$), defined on input $x$ as follows:


1. Repeat the following at most $t$ times:
a. Execute $A(x)$, and get the answer $y$.
b. Apply $D$ to check whether $y$ is a correct answer.
c. If $D$ says "correct", stop the iteration.

2. Return $y$.


The executions of $A(x)$ are $t$ independent repetitions of the same experiment.
Hence, the probability that all $t$ executions of $A$ yield an incorrect answer is
$\le (1 - 1/P(|x|))^t$, and we obtain

\[ \Bigl(1 - \frac{1}{P(|x|)}\Bigr)^{t} = \Bigl( \Bigl(1 - \frac{1}{P(|x|)}\Bigr)^{P(|x|)} \Bigr)^{t/P(|x|)} < (e^{-1})^{t/P(|x|)} = e^{-t/P(|x|)} \le e^{-\ln(2) Q(|x|)} = 2^{-Q(|x|)}, \]

for $t \ge \ln(2) P(|x|) Q(|x|)$.
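
The repeat-and-check construction of the proof can be sketched as follows. This is a minimal illustration under stated assumptions: `algorithm` and `check` are caller-supplied stand-ins for A and D, and the names `repeat_and_check` and `iterations_for` are hypothetical.

```python
import math


def repeat_and_check(algorithm, check, x, t):
    """A_t from the proof: run algorithm(x) at most t times.

    Stops as soon as check(x, y) accepts an answer y and returns the last
    answer computed, which may be incorrect if all t attempts failed.
    """
    y = None
    for _ in range(t):
        y = algorithm(x)
        if check(x, y):
            break
    return y


def iterations_for(p_val, q_val):
    """Smallest integer t with t >= ln(2) * P * Q, as required in the proof."""
    return math.ceil(math.log(2) * p_val * q_val)
```

For instance, guessing a non-trivial divisor of 91 at random among 2, ..., 90 succeeds with probability 2/89 >= 1/45. With P = 45 and Q = 10, `iterations_for(45, 10)` attempts push the failure probability below 2^(-10).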


Remark. The iterations $A_t$ in the preceding proof may be used to construct a
Las Vegas algorithm that always gives a correct answer to the given problem 
(see Exercise 1). 
Exercises


1. Let $P$ be a positive polynomial. Let $A$ be a probabilistic polynomial
algorithm which on input $x \in X$ computes solutions to a given problem
$\mathcal{P}$ with

\[ \mathrm{prob}(A(x) \text{ is a correct answer to } \mathcal{P}) \ge \frac{1}{P(|x|)} \quad \text{for all } x \in X. \]

Assume that there is a deterministic algorithm $D$, which checks in polynomial
time whether a given solution $y$ to $\mathcal{P}$ on input $x$ is correct.
Construct a Las Vegas algorithm that always gives the correct answer
and whose expected running time is $\le P(|x|)(R(|x|) + S(|x| + R(|x|)))$,
where $R$ and $S$ are polynomial bounds for the running times of $A$ and $D$
(use Lemma B.12).


2. Let $L \subset \{0,1\}^*$ be a decision problem in the complexity class BPP of
bounded-error probabilistic polynomial-time problems. This means that
there is a probabilistic polynomial algorithm $A$ with input $x \in \{0,1\}^*$
and output $A(x) \in \{0,1\}$ which solves the membership decision problem
for $L \subset \{0,1\}^*$, with a bounded error probability. More precisely, there
are a positive polynomial $P$ and a constant $a$, $0 < a < 1$, such that

\[ \mathrm{prob}(A(x) = 1) \ge a + \frac{1}{P(|x|)}, \text{ for } x \in L, \text{ and} \]

\[ \mathrm{prob}(A(x) = 1) \le a - \frac{1}{P(|x|)}, \text{ for } x \notin L. \]

Let $Q$ be another positive polynomial. Show how to obtain a probabilistic
polynomial algorithm $\tilde{A}$, with

\[ \mathrm{prob}(\tilde{A}(x) = 1) \ge 1 - \frac{1}{Q(|x|)}, \text{ for } x \in L, \text{ and} \]

\[ \mathrm{prob}(\tilde{A}(x) = 1) \le \frac{1}{Q(|x|)}, \text{ for } x \notin L. \]

(Use a technique similar to that used, for example, in the proof of Proposition 5.6.)


3. A decision problem $L \subset \{0,1\}^*$ belongs to the complexity class RP of
randomized probabilistic polynomial-time problems if there exists a probabilistic
polynomial algorithm $A$ which on input $x \in \{0,1\}^*$ outputs
$A(x) \in \{0,1\}$, and a positive polynomial $Q$, such that $\mathrm{prob}(A(x) = 1) \ge 1/Q(|x|)$,
for $x \in L$, and $\mathrm{prob}(A(x) = 1) = 0$, for $x \notin L$.
A decision problem $L$ belongs to the complexity class NP if there is a
deterministic polynomial algorithm $M(x, y)$ and a polynomial $L'$, such
that $M(x, y) = 1$ for some $y \in \{0,1\}^*$, with $|y| \le L'(|x|)$, if and only if
$x \in L$ ($y$ is called a certificate for $x$). Show that:
a. RP $\subseteq$ BPP.
b. RP $\subseteq$ NP.


4. A decision problem $L \subset \{0,1\}^*$ belongs to the complexity class ZPP
of zero-sided probabilistic polynomial-time problems if there exists a Las
Vegas algorithm $A(x)$, such that $A(x) = 1$ if $x \in L$, and $A(x) = 0$ if
$x \notin L$.
Show that ZPP $\subseteq$ RP.
5. If $S_1, \ldots, S_n$ are independent repetitions of a binary random variable
$X$ and $p := \mathrm{prob}(X = 1) = E(X)$, then the Chernoff bound holds for
$0 < \varepsilon \le p(1 - p)$:

\[ \mathrm{prob}\Bigl( \Bigl| \frac{1}{n} \sum_{i=1}^{n} S_i - p \Bigr| < \varepsilon \Bigr) \ge 1 - 2 e^{-n \varepsilon^2 / 2} \]

(see, e.g., [Rényi70], Section 7.4). Use the Chernoff bound to derive an
improved version of Proposition 5.6.


6. One-Way Functions and the Basic 
Assumptions 


In Chapter 3 we introduced the notion of one-way functions. As the examples 
of RSA encryption and Rabin signatures show, one-way functions play the 
key role in asymmetric cryptography. 

Speaking informally, a one-way function is a map $f : X \to Y$ which is
easy to compute but hard to invert. There is no efficient algorithm that
computes pre-images of $y \in Y$. If we want to use a one-way function $f$ for
encryption in the straightforward way (applying $f$ to the plaintext, as, for
example, in RSA encryption), then $f$ must belong to a special class of one-way
functions. Knowing some information (the "trapdoor information", e.g.
the factorization of the modulus $n$ in RSA schemes), it must be easy to invert
$f$, and $f$ is one way only if the trapdoor information is kept secret. These
functions are called trapdoor functions.

Our notion of one-way functions introduced in Chapter 3 was a rather 
informal one: we did not specify precisely what we mean by “efficiently com- 
putable”, “infeasible” or “hard to invert”. Now, in this chapter, we will clarify 
these terms and give a precise definition of one-way functions. For example, 
“efficiently computable” means that the solution can be computed by a prob- 
abilistic polynomial algorithm, as defined in Chapter 5. 

We discuss three examples in some detail: the discrete exponential func- 
tion, modular powers and modular squaring. The first is not a trapdoor func- 
tion. Nevertheless, it has important applications in cryptography (e.g. pseu- 
dorandom bit generators, see Chapter 8; ElGamal’s encryption and signature 
scheme and the DSA, see Chapter 3). 

Unfortunately, there is no proof that these functions are really one way. 
However, it is possible to state the basic assumptions precisely, which guar- 
antee the one-way feature. It is widely believed that these assumptions are 
true. 

In order to define the one-way feature (and in a way that naturally 
matches the examples), we have to consider not only single functions, but, 
more generally, families of functions defined over appropriate index sets. 

In the preliminary (and very short) Section 6.1, we introduce an intuitive 
notation for probabilities that will be used subsequently. 
6.1 A Notation for Probabilities


The notation

\[ \mathrm{prob}(B(x) = 1 : x \leftarrow X) := \mathrm{prob}(\{x \in X \mid B(x) = 1\}), \]

introduced for Boolean predicates¹ $B : X \to \{0,1\}$ in Appendix B.1 (p.
329), intuitively suggests that we mean the probability of $B(x) = 1$ if $x$ is
randomly chosen from $X$. We will use an analogous notation for probabilistic
algorithms.

Let $A$ be a probabilistic algorithm with inputs from $X$ and outputs in
$Y$, and let $B : X \times Y \to \{0,1\}$, $(x, y) \mapsto B(x, y)$ be a Boolean predicate.
Let $p_X$ be a probability distribution on $X$. As in Section 5.1, let $A_D$ be the
deterministic extension of $A$, and let $t_x$ denote the number of coin tosses of
$A$ on input $x$:

\begin{align*}
\mathrm{prob}(B(x, A(x)) = 1 : x \xleftarrow{p_X} X)
&:= \sum_{x \in X} \mathrm{prob}(x) \cdot \mathrm{prob}(B(x, A(x)) = 1) \\
&= \sum_{x \in X} \mathrm{prob}(x) \cdot \mathrm{prob}(\{r \in \{0,1\}^{t_x} \mid B(x, A_D(x, r)) = 1\}) \\
&= \sum_{x \in X} \mathrm{prob}(x) \cdot \frac{|\{r \in \{0,1\}^{t_x} \mid B(x, A_D(x, r)) = 1\}|}{2^{t_x}}.
\end{align*}


The notation is typically used in the following situation. A Monte Carlo
algorithm $A$ tries to compute a function $f : X \to Y$, and $B(x, y) := B_f(x, y) := 1$
if $f(x) = y$, and $B(x, y) := B_f(x, y) := 0$ if $f(x) \ne y$. Then

\[ \mathrm{prob}(A(x) = f(x) : x \xleftarrow{p_X} X) := \mathrm{prob}(B_f(x, A(x)) = 1 : x \xleftarrow{p_X} X) \]

is the probability that $A$ succeeds if the input $x$ is randomly chosen from $X$
(according to $p_X$).

We write $x \xleftarrow{u} X$ if the distribution on $X$ is the uniform one, and often
we simply write $x \leftarrow X$ instead of $x \xleftarrow{p_X} X$.
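
For small examples, the probability prob(B(x, A(x)) = 1 : x ← X) can be computed exactly by enumerating all coin-toss sequences of the deterministic extension. The sketch below is only an illustration; the function `success_probability` and the toy algorithm `toy_a_d` are hypothetical and not taken from the text.

```python
from fractions import Fraction
from itertools import product


def success_probability(p_x, a_d, tosses, predicate):
    """Sum over x of prob(x) * |{r : B(x, A_D(x, r)) = 1}| / 2**t_x.

    p_x:       dict mapping each input x to its probability (a Fraction)
    a_d:       deterministic extension A_D(x, r), r a tuple of coin tosses
    tosses:    function x -> t_x, the number of coin tosses on input x
    predicate: Boolean predicate B(x, y)
    """
    total = Fraction(0)
    for x, px in p_x.items():
        t = tosses(x)
        good = sum(1 for r in product((0, 1), repeat=t) if predicate(x, a_d(x, r)))
        total += px * Fraction(good, 2 ** t)
    return total


def toy_f(x):
    """Toy target function: the parity of x."""
    return x % 2


def toy_a_d(x, r):
    """Toy algorithm: returns the wrong parity iff it tossed at least one
    coin and every toss came up 1."""
    error = 1 if r and all(r) else 0
    return toy_f(x) ^ error
```

With the uniform distribution on {0, 1, 2, 3} and t_x = x coin tosses, the toy algorithm succeeds with probabilities 1, 1/2, 3/4 and 7/8 on the four inputs, so the overall success probability is 25/32.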

In cryptography, we often consider probabilistic algorithms whose domain
$X$ is a joint probability space $X_1 X_2 \ldots X_r$ constructed by iteratively joining
fibers $X_{j, x_1 \ldots x_{j-1}}$ to $X_1 \ldots X_{j-1}$ (Appendix B.1, p. 328). In this case, the
notation is

\[ \mathrm{prob}(B(x_1, \ldots, x_r, A(x_1, \ldots, x_r)) = 1 : x_1 \leftarrow X_1, x_2 \leftarrow X_{2,x_1}, x_3 \leftarrow X_{3,x_1 x_2}, \ldots, x_r \leftarrow X_{r, x_1 \ldots x_{r-1}}). \]

Now, the notation suggests that we mean the probability of the event
$B(x_1, \ldots, x_r, A(x_1, \ldots, x_r)) = 1$ if first $x_1$ is randomly chosen, then $x_2$, then
$x_3$, then ....
A typical example is the discrete logarithm assumption in Section 6.2
(Definition 6.1).

¹ Maps with values in {0,1} are called Boolean predicates.

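The distribution $x_j \leftarrow X_{j, x_1 \ldots x_{j-1}}$ is the conditional distribution of
$x_j \in X_{j, x_1 \ldots x_{j-1}}$, assuming $x_1, \ldots, x_{j-1}$. The probability can be computed
as follows (we consider $r = 3$ and the case where $A$ computes a function $f$
and $B$ is the predicate $A(x) = f(x)$):

\begin{align*}
&\mathrm{prob}(A(x_1, x_2, x_3) = f(x_1, x_2, x_3) : x_1 \leftarrow X_1, x_2 \leftarrow X_{2,x_1}, x_3 \leftarrow X_{3,x_1 x_2}) \\
&\quad = \sum_{x_1, x_2, x_3} \mathrm{prob}(x_1, x_2, x_3) \cdot \mathrm{prob}(A(x_1, x_2, x_3) = f(x_1, x_2, x_3)) \\
&\quad = \sum_{x_1 \in X_1} \mathrm{prob}(x_1) \cdot \sum_{x_2 \in X_{2,x_1}} \mathrm{prob}(x_2 \mid x_1) \cdot \sum_{x_3 \in X_{3,x_1 x_2}} \mathrm{prob}(x_3 \mid x_2, x_1) \cdot \mathrm{prob}(A(x_1, x_2, x_3) = f(x_1, x_2, x_3)).
\end{align*}

Here $\mathrm{prob}(x_2 \mid x_1)$ (resp. $\mathrm{prob}(x_3 \mid x_2, x_1)$) denotes the conditional probability
of $x_2$ (resp. $x_3$) assuming $x_1$ (resp. $x_1$ and $x_2$); see Appendix B.1 (p. 328).
The last probability, $\mathrm{prob}(A(x_1, x_2, x_3) = f(x_1, x_2, x_3))$, is the probability
that the coin tosses of $A$ on input $(x_1, x_2, x_3)$ yield the result $f(x_1, x_2, x_3)$.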

In Section 5.1, we introduced the image $p_Y$ of the distribution $p_X$ under
a probabilistic algorithm $A$ from $X$ to $Y$. We have

\[ p_Y(y) = \mathrm{prob}(A(x) = y : x \xleftarrow{p_X} X) \]

for each $y \in Y$.
For each $x \in X$, we have the distribution $p_{A(x)}$ on $Y$:

\[ p_{A(x)}(y) = \mathrm{prob}(A(x) = y). \]

We write $y \leftarrow A(x)$ instead of $y \xleftarrow{p_{A(x)}} Y$. This notation suggests that $y$ is
generated by the random variable $A(x)$. With this notation, we have

\[ \mathrm{prob}(A(x) = f(x) : x \leftarrow X) = \mathrm{prob}(f(x) = y : x \leftarrow X, y \leftarrow A(x)). \]


6.2 Discrete Exponential Function 


The notion of one-way functions can be precisely defined using probabilistic
algorithms. As a first example we consider the discrete exponential function.
Let $I := \{(p, g) \mid p \text{ a prime number}, g \in \mathbb{Z}_p^* \text{ a primitive root}\}$. We call the
family of discrete exponential functions

\[ \mathrm{Exp} := (\mathrm{Exp}_{p,g} : \mathbb{Z}_{p-1} \to \mathbb{Z}_p^*,\ x \mapsto g^x)_{(p,g) \in I} \]

the Exp family. Since $g$ is a primitive root, $\mathrm{Exp}_{p,g}$ is an isomorphism between
the additive group $\mathbb{Z}_{p-1}$ and the multiplicative group $\mathbb{Z}_p^*$. The family
of inverse functions

\[ \mathrm{Log} := (\mathrm{Log}_{p,g} : \mathbb{Z}_p^* \to \mathbb{Z}_{p-1})_{(p,g) \in I} \]

is called the Log family.

The algorithm of modular exponentiation computes $\mathrm{Exp}_{p,g}(x)$ efficiently
(see Algorithm A.26). It is unknown whether an efficient algorithm for the
computation of the discrete logarithm function exists. No polynomial-time
algorithm is known (the best known algorithms for the general case have
super-polynomial running time), and it is widely believed that, in general,
$\mathrm{Log}_{p,g}$ is not efficiently computable. We state this assumption on the one-way
property of the Exp family by means of probabilistic algorithms.
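
The asymmetry between the two directions can be made concrete for a toy modulus. The following sketch is an illustration only; the function names `exp_pg` and `log_pg_bruteforce` are hypothetical, and the exhaustive search merely stands in for "some" inversion algorithm, not for the best known one.

```python
def exp_pg(p, g, x):
    """Exp_{p,g}(x) = g^x mod p, computed by Python's built-in modular pow."""
    return pow(g, x, p)


def log_pg_bruteforce(p, g, y):
    """Log_{p,g}(y) by exhaustive search over all exponents 0, ..., p-2.

    Needs up to p-1 multiplications, i.e. exponentially many in the
    binary length |p|.
    """
    value = 1
    for x in range(p - 1):
        if value == y:
            return x
        value = value * g % p
    raise ValueError("y is not a power of g modulo p")
```

For p = 23 and the primitive root g = 5, `exp_pg` maps {0, ..., 21} bijectively onto {1, ..., 22} and `log_pg_bruteforce` inverts it. The search is feasible here only because p is tiny.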


Definition 6.1. Let $I_k := \{(p, g) \in I \mid |p| = k\}$, with $k \in \mathbb{N}$,² and let $Q(X) \in
\mathbb{Z}[X]$ be a positive polynomial. Let $A(p, g, y)$ be a probabilistic polynomial
algorithm. Then there exists a $k_0 \in \mathbb{N}$, such that

\[ \mathrm{prob}(A(p, g, y) = \mathrm{Log}_{p,g}(y) : (p, g) \xleftarrow{u} I_k, y \xleftarrow{u} \mathbb{Z}_p^*) \le \frac{1}{Q(k)} \]

for $k \ge k_0$.
This is called the discrete logarithm assumption.


Remarks: 


1. The probabilistic algorithm $A$ models an attacker who tries to compute
the discrete logarithm or, equivalently, to invert the discrete exponential
function. The discrete logarithm assumption essentially states that for a
sufficiently large size $k$ of the modulus $p$, the probability of $A$ successfully
computing $\mathrm{Log}_{p,g}(y)$ is smaller than $1/Q(k)$. This means that Exp cannot
be inverted by $A$ for all but a negligible fraction of the inputs. Therefore,
we call Exp a family of one-way functions. The term "negligible" is
explained more precisely in a subsequent remark.

2. When we use the discrete exponential function in a cryptographic scheme,
such as ElGamal's encryption scheme (see Section 3.5.1), selecting a function
$\mathrm{Exp}_{p,g}$ from the family means to choose a public key $i = (p, g)$
(actually, $i$ may be only one part of the key).

3. The index set $I$ is partitioned into disjoint subsets: $I = \bigcup_{k \in \mathbb{N}} I_k$. $k$ may
be considered as the security parameter of $i = (p, g) \in I_k$. The one-way
property requires a sufficiently large security parameter. The security
parameter is closely related to the binary length of $i$. Here, $k = |p|$ is half
the length of $i$.

4. The probability in the discrete logarithm assumption is also taken over
the random choice of a key $i$ with a given security parameter $k$. Hence, the
meaning of the probability statement is: choosing both the key $i = (p, g)$
with security parameter $k$ and $y = g^x$ randomly, the probability that $A$
correctly computes the logarithm $x$ from $y$ is small. The statement is not
related to a particular key $i$. In practice, however, a public key is chosen
and then fixed for a long time, and it is known to the adversary. Thus, we
are interested in the conditional probability of success, assuming a fixed
public key $i$. Even if the security parameter $k$ is very large, there may be
keys $(p, g)$ such that $A$ correctly computes $\mathrm{Log}_{p,g}(y)$ with a significant
chance. However, as we will see below, the number of such keys $(p, g)$ is
negligibly small compared to all keys with security parameter $k$. Choosing
$(p, g)$ at random (and uniformly) from $I_k$, the probability of obtaining
one for which $A$ has a significant chance of success is negligibly small (see
Proposition 6.3 for a precise statement). Indeed, if $p - 1$ has only small
prime factors, an efficient algorithm developed by Pohlig and Hellman
computes the discrete logarithm function (see [PohHel78]).

² As usual, $|p|$ denotes the binary length of $p$.


Remark. In this book, we often consider families $\varepsilon = (\varepsilon_k)_{k \in \mathbb{N}}$ of quantities
$\varepsilon_k \in \mathbb{R}$, such as the probabilities in the discrete logarithm assumption. We call
them negligible or negligibly small if, for every positive polynomial $Q \in \mathbb{Z}[X]$,
there is a $k_0 \in \mathbb{N}$, such that $|\varepsilon_k| \le 1/Q(k)$ for $k \ge k_0$. "Negligible" means that
the absolute value is asymptotically smaller than the inverse of any positive polynomial.
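
As a brief worked illustration (not taken from the original text), the family $(2^{-k})_{k \in \mathbb{N}}$ is negligible, whereas $(k^{-10})_{k \in \mathbb{N}}$ is not:

```latex
\begin{align*}
&\text{Let } Q(X) = \sum_{i=0}^{d} a_i X^i \text{ be a positive polynomial and } c := \sum_{i=0}^{d} |a_i|. \\
&\text{Then } Q(k) \le c\,k^{d} \text{ for } k \ge 1, \text{ and } c\,k^{d} \le 2^{k} \text{ for all sufficiently large } k, \\
&\text{so there is a } k_0 \text{ with } 2^{-k} \le \frac{1}{Q(k)} \text{ for } k \ge k_0: \text{ the family } (2^{-k})_{k} \text{ is negligible.} \\[4pt]
&\text{For } \varepsilon_k = k^{-10} \text{ choose } Q(X) = X^{11}: \quad \varepsilon_k = k^{-10} > k^{-11} = \frac{1}{Q(k)} \text{ for all } k \ge 2, \\
&\text{so no } k_0 \text{ as in the definition exists, and } (k^{-10})_{k} \text{ is not negligible.}
\end{align*}
```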


Remark. In order to simplify, definitions and results are often stated
asymptotically (such as the discrete logarithm assumption or the notion of negligible
quantities). Polynomial running times or negligible probabilities are not
specified more precisely, even if it were possible. A typical situation is as follows. A
cryptographic scheme is based on a one-way function $f$ (e.g. the Exp family).
Let $g$ be a function that describes a property of the cryptographic scheme
(e.g. $g$ predicts the next bit of the discrete exponential pseudorandom bit
generator; see Chapter 8). It is desirable that this property $g$ cannot be
efficiently computed by an adversary. Sometimes, this can be proven. Typically a
proof runs by contradiction. We assume that a probabilistic polynomial
algorithm $A_1$ which successfully computes $g$ with probability $\varepsilon_1$ is given. Then,
a probabilistic polynomial algorithm $A_2$ is constructed which calls $A_1$ as a
subroutine and inverts the underlying one-way function $f$, with probability
$\varepsilon_2$. Such an algorithm is called a polynomial-time reduction of $f$ to $g$. If $\varepsilon_2$
is non-negligible, we get a contradiction to the one-way assumption (e.g. the
discrete logarithm assumption).

In our example, a typical statement would be as follows. If discrete logarithms
cannot be computed in polynomial time with non-negligible probability
(i.e., if the discrete logarithm assumption is true), then a polynomial-time
adversary cannot predict, with non-negligible probability, the next bit of the
discrete exponential pseudorandom bit generator.

Actually, in many cases the statement could be made more precise, by
performing a detailed analysis of the reduction algorithm $A_2$. The running
time of $A_2$ can be described as an explicit function of $\varepsilon_1$, $\varepsilon_2$ and the running
time of $A_1$ (see, e.g., the results in Chapter 7).


As in the Exp example, we often meet families of functions indexed on a set 
of keys which may be partitioned according to a security parameter. Therefore 
we propose the notion of indexes, whose binary lengths are measured by a 
security parameter as specified more precisely by the following definition. 


Definition 6.2. Let $I = \bigcup_{k \in \mathbb{N}} I_k$ be an infinite index set which is partitioned
into finite disjoint subsets $I_k$. Assume that the indexes are binarily encoded.
As always, we denote by $|i|$ the binary length of $i$.

$I$ is called a key set with security parameter $k$ or an index set with security
parameter $k$, if:


1. The security parameter $k$ of $i \in I$ can be derived from $i$ by a deterministic
polynomial algorithm.
2. There is a constant $m \in \mathbb{N}$, such that

\[ k^{1/m} \le |i| \le k^{m} \quad \text{for } i \in I_k. \]

We usually write $I = (I_k)_{k \in \mathbb{N}}$ instead of $I = \bigcup_{k \in \mathbb{N}} I_k$.
Remarks:


1. The second condition means that the security parameter $k$ is a measure
for the binary length $|i|$ of the elements $i \in I_k$. In particular, statements
such as:

(1) "There is a polynomial $P$ with $\ldots \le P(|i|)$", or

(2) "For every positive polynomial $Q$, there is a $k_0 \in \mathbb{N}$, such that
$\ldots \le 1/Q(|i|)$ for $|i| \ge k_0$",

are equivalent to the corresponding statements in which $|i|$ is replaced
by the security parameter $k$. In almost all of our examples, we have
$k \le |i| \le 3k$ for $i \in I_k$.

2. The index set $I$ of the Exp family is a key set with security parameter.
As with all indexes occurring in this book, the indexes of the Exp family
consist of numbers in $\mathbb{N}$ or residues in some residue class ring $\mathbb{Z}_n$. Unless
otherwise stated, we consider them as binarily encoded in the natural way
(see Appendix A): the binary encoding of $x \in \mathbb{N}$ is its standard encoding
as an unsigned number, and the encoding of a residue class $[a] \in \mathbb{Z}_n$ is
the encoding of its representative $x$ with $0 \le x \le n - 1$.

3. If $I = \bigcup_{k \in \mathbb{N}} I_k$ satisfies only the second condition, then we can easily
modify it and turn it into a key set with a security parameter, which also
satisfies the first condition. Namely, let $\tilde{I}_k := \{(i, k) \mid i \in I_k\}$ and replace
$I$ by $\tilde{I} := \bigcup_{k \in \mathbb{N}} \tilde{I}_k$.

In the discrete logarithm assumption, we do not consider a single fixed
key $i$: the probability is also taken over the random choice of the key. In the
following proposition, we relate this average probability to the conditional
probabilities, assuming a fixed key.
Proposition 6.3. Let $I = (I_k)_{k \in \mathbb{N}}$ be a key set with security parameter $k$.
Let $f = (f_i : X_i \to Y_i)_{i \in I}$ be a family of functions and $A$ be a probabilistic
polynomial algorithm with inputs $i \in I$ and $x \in X_i$ and output in $Y_i$. Assume
that probability distributions are given on $I_k$ and $X_i$ for all $k$, $i$ (e.g. the
uniform distributions). Then the following statements are equivalent:


1. For every positive polynomial $P$, there is a $k_0 \in \mathbb{N}$, such that for all
$k \ge k_0$

\[ \mathrm{prob}(A(i, x) = f_i(x) : i \leftarrow I_k, x \leftarrow X_i) \le \frac{1}{P(k)}. \]

2. For all positive polynomials $Q$ and $R$, there is a $k_0 \in \mathbb{N}$, such that for all
$k \ge k_0$

\[ \mathrm{prob}\Bigl( \Bigl\{ i \in I_k \Bigm| \mathrm{prob}(A(i, x) = f_i(x) : x \leftarrow X_i) > \frac{1}{Q(k)} \Bigr\} \Bigr) \le \frac{1}{R(k)}. \]


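Proof. Let

\[ p_i := \mathrm{prob}(A(i, x) = f_i(x) : x \leftarrow X_i) \]

be the conditional probability of success of $A$ assuming a fixed $i$.
We first prove that statement 2 implies statement 1. Let $P$ be a positive
polynomial. By statement 2 (applied with $Q = R = 2P$), there is some $k_0 \in \mathbb{N}$ such that for all $k \ge k_0$

\[ \mathrm{prob}\Bigl( \Bigl\{ i \in I_k \Bigm| p_i > \frac{1}{2P(k)} \Bigr\} \Bigr) \le \frac{1}{2P(k)}. \]

Hence

\begin{align*}
\mathrm{prob}(A(i, x) = f_i(x) : i \leftarrow I_k, x \leftarrow X_i)
&= \sum_{i \in I_k} \mathrm{prob}(i) \cdot p_i \\
&= \sum_{p_i \le 1/(2P(k))} \mathrm{prob}(i) \cdot p_i + \sum_{p_i > 1/(2P(k))} \mathrm{prob}(i) \cdot p_i \\
&\le \sum_{p_i \le 1/(2P(k))} \mathrm{prob}(i) \cdot \frac{1}{2P(k)} + \sum_{p_i > 1/(2P(k))} \mathrm{prob}(i) \cdot 1 \\
&= \mathrm{prob}\Bigl( \Bigl\{ i \in I_k \Bigm| p_i \le \frac{1}{2P(k)} \Bigr\} \Bigr) \cdot \frac{1}{2P(k)} + \mathrm{prob}\Bigl( \Bigl\{ i \in I_k \Bigm| p_i > \frac{1}{2P(k)} \Bigr\} \Bigr) \\
&\le \frac{1}{2P(k)} + \frac{1}{2P(k)} = \frac{1}{P(k)},
\end{align*}

for $k \ge k_0$.


Conversely, assume that statement 1 holds. Let $Q$ and $R$ be positive polynomials.
Then there is a $k_0 \in \mathbb{N}$ such that for all $k \ge k_0$

\begin{align*}
\frac{1}{Q(k) R(k)} &\ge \mathrm{prob}(A(i, x) = f_i(x) : i \leftarrow I_k, x \leftarrow X_i) = \sum_{i \in I_k} \mathrm{prob}(i) \cdot p_i \\
&\ge \sum_{p_i > 1/Q(k)} \mathrm{prob}(i) \cdot p_i \ge \frac{1}{Q(k)} \cdot \mathrm{prob}\Bigl( \Bigl\{ i \in I_k \Bigm| p_i > \frac{1}{Q(k)} \Bigr\} \Bigr).
\end{align*}

This inequality implies statement 2.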


Remark. A nice feature of the discrete logarithm problem is that it is random
self-reducible. This means that solving the problem for arbitrary inputs can be
reduced to solving the problem for randomly chosen inputs. More precisely, let
$(p, g) \in I_k := \{(p, g) \mid p \text{ a prime}, |p| = k, g \in \mathbb{Z}_p^* \text{ a primitive root}\}$. Assume
that there is a probabilistic polynomial algorithm $A$, such that

\[ \mathrm{prob}(A(p, g, y) = \mathrm{Log}_{p,g}(y) : y \xleftarrow{u} \mathbb{Z}_p^*) > \frac{1}{Q(k)} \qquad (6.1) \]

for some positive polynomial $Q$; i.e., $A(p, g, y)$ correctly computes the discrete
logarithm with a non-negligible probability if the input $y$ is randomly selected.
Since $y$ is chosen uniformly, we may rephrase this statement: $A(p, g, y)$
correctly computes the discrete logarithm for a polynomial fraction of inputs
$y \in \mathbb{Z}_p^*$.

Then, however, there is also a probabilistic polynomial algorithm $\tilde{A}$ which
correctly computes the discrete logarithm for every input $y \in \mathbb{Z}_p^*$, with an
overwhelmingly high probability. Namely, given $y \in \mathbb{Z}_p^*$, we apply a slight
modification $A_1$ of $A$. On input $(p, g, y)$, $A_1$ randomly selects $r \xleftarrow{u} \mathbb{Z}_{p-1}$ and
returns

\[ A_1(p, g, y) := (A(p, g, y g^r) - r) \bmod (p - 1). \]

Then $\mathrm{prob}(A_1(p, g, y) = \mathrm{Log}_{p,g}(y)) > 1/Q(k)$ for every $y \in \mathbb{Z}_p^*$. Now we can
apply Proposition 5.7. For every positive polynomial $P$, we obtain (by repeating
the computation of $A_1(p, g, y)$ a polynomial number of times and by
checking each time whether the result is correct) a probabilistic polynomial
algorithm $\tilde{A}$, with

\[ \mathrm{prob}(\tilde{A}(p, g, y) = \mathrm{Log}_{p,g}(y)) > 1 - 2^{-P(k)}, \]

for every $y \in \mathbb{Z}_p^*$. The existence of a random self-reduction enhances the
credibility of the discrete logarithm assumption. Namely, assume that the
discrete logarithm assumption is not true. Then by Proposition 6.3 there is
a probabilistic polynomial algorithm $A$, such that for infinitely many $k$, the
inequality (6.1) holds for a polynomial fraction of keys $(p, g)$; i.e.,

\[ \frac{|\{(p, g) \in I_k \mid \text{inequality (6.1) holds}\}|}{|I_k|} > \frac{1}{R(k)} \]

(with $R$ a positive polynomial). For these keys $\tilde{A}$ computes the discrete
logarithm for every $y \in \mathbb{Z}_p^*$ with an overwhelmingly high probability, and the
probability of obtaining such a key is $> 1/R(k)$ if the keys are selected
uniformly at random.
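
The random self-reduction can be sketched as follows. The code is an illustration under stated assumptions: `partial_log` is a hypothetical stand-in for an algorithm A that is correct only on a fraction of the inputs, and the names `self_reduce` and `solve_every_input` are not from the text.

```python
import random


def self_reduce(partial_log, p, g, y, rng):
    """A_1: shift y by a random power of g, query the solver, undo the shift.

    Because y * g^r is uniformly distributed in Z_p^* for uniform r, the
    success probability no longer depends on the particular y.
    """
    r = rng.randrange(p - 1)
    shifted = y * pow(g, r, p) % p
    return (partial_log(p, g, shifted) - r) % (p - 1)


def solve_every_input(partial_log, p, g, y, attempts, rng):
    """Repeat A_1 and verify each candidate by modular exponentiation,
    in the manner of Proposition 5.7. Returns None if every attempt failed."""
    for _ in range(attempts):
        x = self_reduce(partial_log, p, g, y, rng)
        if pow(g, x, p) == y:
            return x
    return None
```

In a toy setting with p = 23 and g = 5, a solver that knows the logarithm only for the six inputs 1, ..., 6 (and answers 0 otherwise) is turned into one that succeeds for every y, provided that enough attempts are allowed.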


6.3 Uniform Sampling Algorithms 


In the discrete logarithm assumption 6.1, the probabilities are taken with
respect to the uniform distributions on $I_k$ and $\mathbb{Z}_p^*$. Stating the assumption
in this way, we tacitly assumed that it is possible to sample uniformly over
$I_k$ (during key generation) and $\mathbb{Z}_p^*$, by using efficient algorithms. In practice
it might be difficult to construct a probabilistic polynomial sampling algorithm
that selects the elements exactly according to the uniform distribution.
However, as in the present case of discrete logarithms (see Proposition 6.6),
we are often able to find practical sampling algorithms which sample in a
"virtually uniform" way. Then the assumptions stated for the uniform distribution,
such as the discrete logarithm assumption, apply. This is shown by
the following considerations.


Definition 6.4. Let $J = (J_k)_{k \in \mathbb{N}}$ be an index set with security parameter $k$
(see Definition 6.2). Let $X = (X_j)_{j \in J}$ be a family of finite sets:


1. A probabilistic polynomial algorithm $S_X$ with input $j \in J$ is called a
sampling algorithm for $X$ if $S_X(j)$ outputs an element in $X_j$ with a
probability $\ge 1 - \varepsilon_k$ for $j \in J_k$, where $\varepsilon = (\varepsilon_k)_{k \in \mathbb{N}}$ is negligible; i.e., given
a positive polynomial $Q$, there is a $k_0$ such that $\varepsilon_k \le 1/Q(k)$ for $k \ge k_0$.

2. A sampling algorithm $S_X$ for $X$ is called (virtually) uniform if the
distributions of $S_X(j)$ and the uniform distributions on $X_j$ are polynomially
close (see Definition B.22). This means that the statistical distance is
negligibly small; i.e., given a positive polynomial $Q$, there is a $k_0$ such
that the statistical distance (see Definition B.19) between the distribution
of $S_X(j)$ and the uniform distribution on $X_j$ is $\le 1/Q(k)$, for $k \ge k_0$
and $j \in J_k$.


Remark. If $S_X$ is a virtually uniform sampling algorithm for $X = (X_j)_{j \in J}$, we
usually do not need to distinguish between the virtually uniform distribution
of $S_X(j)$ and the truly uniform distribution when we compute a probability
involving $x \leftarrow S_X(j)$. Namely, consider probabilities

\[ \mathrm{prob}(B_j(x, y) = 1 : x \leftarrow S_X(j), y \leftarrow Y_{j,x}), \]

where $(Y_{j,x})_{x \in X_j}$ is a family of probability spaces and $B_j$ is a Boolean
predicate. Then for every positive polynomial $P$, there is a $k_0 \in \mathbb{N}$ such that

\[ \bigl| \mathrm{prob}(B_j(x, y) = 1 : x \leftarrow S_X(j), y \leftarrow Y_{j,x}) - \mathrm{prob}(B_j(x, y) = 1 : x \xleftarrow{u} X_j, y \leftarrow Y_{j,x}) \bigr| < \frac{1}{P(k)}, \]

for $k \ge k_0$ and $j \in J_k$ (by Lemmas B.21 and B.24), and we see that the
difference between the probabilities is negligibly small.

Therefore, we usually do not distinguish between perfectly and virtually
uniform sampling algorithms and simply talk of uniform sampling algorithms.


We study an example. Suppose we want to construct a uniform sampling
algorithm $S$ for $(\mathbb{Z}_n)_{n \in \mathbb{N}}$. We have $\mathbb{Z}_n \subseteq \{0,1\}^{|n|}$, and could
proceed as follows. We toss the coin $|n|$ times and obtain a binary number
$x := b_{|n|-1} \ldots b_1 b_0$, with $0 \le x < 2^{|n|}$. We can easily verify whether $x \in \mathbb{Z}_n$,
by checking $x < n$. If the answer is affirmative, we return $S(n) := x$.
Otherwise, we repeat the coin tosses. Since $S$ is required to have a polynomial
running time, we have to stop after at most $P(|n|)$ iterations ($P$ a polynomial).
Thus, $S(n)$ does not always succeed in returning an element in $\mathbb{Z}_n$. The
probability of a failure is, however, negligibly small.³
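
This rejection-sampling procedure can be sketched in Python as follows. The sketch is illustrative: the name `sample_zn` and the parameter `max_iterations` (playing the role of P(|n|)) are hypothetical, and `rng.getrandbits` stands in for the coin tosses.

```python
import random


def sample_zn(n, max_iterations, rng):
    """Sample from Z_n = {0, ..., n-1} by rejection from {0,1}^{|n|}.

    Returns None if no candidate was accepted within max_iterations tries.
    """
    bits = n.bit_length()              # |n|, the binary length of n
    for _ in range(max_iterations):
        x = rng.getrandbits(bits)      # |n| coin tosses
        if x < n:                      # membership test for Z_n
            return x
    return None
```

Since n >= 2^(|n|-1), each candidate is accepted with probability at least 1/2, so the probability of returning None is at most 2^(-max_iterations).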

Our construction, which derives a uniform sampling algorithm for a subset,
works if the membership in this subset can be efficiently tested. It can
be applied in many situations. Therefore, we state the following lemma.


Lemma 6.5. Let $J = (J_k)_{k \in \mathbb{N}}$ be an index set with security parameter $k$. Let
$X = (X_j)_{j \in J}$ and $Y = (Y_j)_{j \in J}$ be families of finite sets with $Y_j \subseteq X_j$ for all
$j \in J$. Assume that there is a polynomial $Q$, such that $|Y_j| \cdot Q(k) \ge |X_j|$ for
$j \in J_k$.

Let $S_X$ be a uniform sampling algorithm for $(X_j)_{j \in J}$ which on input $j \in J$
outputs $x \in X_j$⁴ and some additional information $aux(x)$ about $x$. Let
$A(j, x, aux(x))$ be a Monte Carlo algorithm which decides the membership in
$Y_j$; i.e., on input $j \in J$, $x \in X_j$ and $aux(x)$, it yields 1 if $x \in Y_j$, and 0 if
$x \notin Y_j$. Assume that the error probability of $A$ is negligible; i.e., for every
positive polynomial $P$, there is a $k_0$ such that the error probability is $\le 1/P(k)$
for $k \ge k_0$.
Then there exists a uniform sampling algorithm $S_Y$ for $(Y_j)_{j \in J}$.


Proof. Let $S_Y$ be the probabilistic polynomial algorithm which on input $j \in J$
repeatedly computes $x := S_X(j)$ until $A(j, x, aux(x)) = 1$. To get a polynomial
algorithm, we stop $S_Y$ after at most $\ln(2) k Q(k)$ iterations.

We now show that $S_Y$ has the desired properties. We first assume that
$S_X(j) \in X_j$ with certainty and that $A$ has an error probability of 0. By
Lemma B.10, we may also assume that $S_Y$ has found an element in $Y_j$ (before
being stopped), because this event has a probability $\ge 1 - (1 - 1/Q(k))^{\ln(2) k Q(k)} >
1 - 2^{-k}$, which is exponentially close to 1 (see the proof of Proposition 5.7
for an analogous estimate).

³ We could construct a Las Vegas algorithm in this way, which always succeeds.
See Section 5.2.

⁴ Here and in what follows we use this formulation, though the sampling algorithm
may sometimes yield elements outside of $X_j$. However, as stated in Definition
6.4, this happens only with a negligible probability.

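By construction, we have for $V \subseteq Y_j$ that

\[ \mathrm{prob}(S_Y(j) \in V) = \mathrm{prob}(S_X(j) \in V \mid S_X(j) \in Y_j) = \frac{\mathrm{prob}(S_X(j) \in V)}{\mathrm{prob}(S_X(j) \in Y_j)}. \]

Thus, we have for all subsets $V \subseteq Y_j$ that

\[ \mathrm{prob}(S_Y(j) \in V) = \frac{\frac{|V|}{|X_j|} + \varepsilon_j(V)}{\frac{|Y_j|}{|X_j|} + \varepsilon_j(Y_j)} \]

with a negligibly small function $\varepsilon_j$. Then $\mathrm{prob}(S_Y(j) \in V) - \frac{|V|}{|Y_j|}$ is also
negligibly small (you can see this immediately by Taylor's formula for the
real function $(x,y) \mapsto x/y$). Hence, $S_Y$ is a uniform sampling algorithm for
$(Y_j)_{j \in J}$.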

The general case, where $S_X(j) \notin X_j$ with a negligible probability and $A$
has a negligible error probability, follows by applying Lemma B.10.


Example. Let $Y_n := \mathbb{Z}_n$, or $Y_n := \mathbb{Z}_n^*$. Then $Y_n$ is a subset of
$X_n := \{0,1\}^{|n|}$, $n \in \mathbb{N}$. Obviously, $\{0,1\}^{|n|}$ can be sampled uniformly by $|n|$ coin
tosses. The membership of $x$ in $\mathbb{Z}_n$ is checked by $x < n$, and the Euclidean
algorithm tells us whether $x$ is a unit. Thus, there are (probabilistic
polynomial) uniform sampling algorithms for $(\mathbb{Z}_n)_{n \in \mathbb{N}}$ and $(\mathbb{Z}_n^*)_{n \in \mathbb{N}}$, which on input
$n \in \mathbb{N}$ output an element $x \in \mathbb{Z}_n$ (or $x \in \mathbb{Z}_n^*$).

To apply Lemma 6.5 in this example, let $J := \mathbb{N}$ and $J_k := \{n \in \mathbb{N} \mid |n| = k\}$.
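The construction in the proof of Lemma 6.5 is a rejection sampler, and the example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `sample_subset` and `sample_unit` are hypothetical, and the bound $Q(k) = 2k$ is an assumption derived from the estimate $\varphi(x) \cdot |x| \geq x$ (so that $|\mathbb{Z}_n^*| \cdot 2k \geq 2^k$ for $k = |n|$).

```python
import math
import random

def sample_subset(sample_x, is_member, max_iter):
    """Draw from X until the membership test accepts; stop after max_iter draws."""
    for _ in range(max_iter):
        x = sample_x()
        if is_member(x):
            return x
    return None  # failure; happens with probability < 2^-k for max_iter = ln(2)*k*Q(k)

def sample_unit(n, rng):
    """Sample x uniformly from Z_n^* by rejection from {0,1}^|n|."""
    k = n.bit_length()            # security parameter k = |n|
    q = 2 * k                     # assumed polynomial bound Q(k)
    max_iter = math.ceil(math.log(2) * k * q)
    return sample_subset(
        lambda: rng.getrandbits(k),                   # |n| coin tosses
        lambda x: x < n and math.gcd(x, n) == 1,      # x in Z_n and x is a unit
        max_iter,
    )
```

For example, `sample_unit(15, random.Random())` returns one of the eight units modulo 15; each draw from $\{0,1\}^4$ is accepted with probability $8/16$.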


Example. Let $\mathrm{Primes}_k$ be the set of primes $p$ whose binary length $|p|$ is $k$;
i.e., $\mathrm{Primes}_k := \{p \in \mathrm{Primes} \mid |p| = k\} \subset \{0,1\}^k$. The number of primes
$< 2^k$ is $\approx 2^k/(k \ln(2))$ (Theorem A.68). By iterating a probabilistic primality
test (e.g. Miller-Rabin's test, see Appendix A.8), we can, with a probability
$> 1 - 2^{-k}$, correctly test the primality of an element $x$ in $\{0,1\}^k$. Thus, there
is a (probabilistic polynomial) uniform sampling algorithm $S$ which on input
$1^k$ yields a prime $p \in \mathrm{Primes}_k$.

To apply Lemma 6.5 in this example, let $J_k := \{1^k\}$ and $J := \mathbb{N} = \bigcup_{k \in \mathbb{N}} J_k$;
i.e., the index set is the set of natural numbers. However, an index $k \in \mathbb{N}$ is
not encoded in the standard way; it is encoded as the constant bit string $1^k$
(see the subsequent remark on $1^k$).
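A minimal Python sketch of this prime sampler follows. It is an illustration, not the book's algorithm: the function names are hypothetical, the round count `k // 2 + 1` is an assumption based on the standard bound that a composite passes one Miller-Rabin round with probability at most 1/4, and the iteration cap from Lemma 6.5 is omitted for brevity.

```python
import random

def is_probable_prime(n, rounds, rng):
    """Miller-Rabin test with trial division by a few small primes first."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):
        if n % p == 0:
            return n == p
    d, s = n - 1, 0
    while d % 2 == 0:             # write n - 1 = 2^s * d with d odd
        d //= 2
        s += 1
    for _ in range(rounds):
        a = rng.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False          # a is a witness: n is composite
    return True

def sample_prime(k, rng):
    """Draw uniform k-bit strings until one has length k and passes the test."""
    rounds = k // 2 + 1           # error for a composite <= 4^-rounds < 2^-k
    while True:
        x = rng.getrandbits(k)
        if x.bit_length() == k and is_probable_prime(x, rounds, rng):
            return x
```

A prime is always accepted by the test, so every prime of length $k$ is returned with the same probability, up to the negligible chance that a composite slips through.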


Remark. $1^k$ denotes the constant bit string $11\ldots1$ of length $k$. Using it as
input for a polynomial algorithm means that the number of steps in the
algorithm is bounded by a polynomial in $k$. If we used $k$ (encoded in the
standard way) instead of $1^k$ as input, the bound would be a polynomial in
$\log_2(k)$.

We return to the example of discrete exponentiation.


Proposition 6.6. Let $I := \{(p,g) \mid p \text{ prime number}, g \in \mathbb{Z}_p^* \text{ primitive root}\}$
and $I_k := \{(p,g) \in I \mid |p| = k\}$. There is a probabilistic polynomial uniform
sampling algorithm for $I = (I_k)_{k \in \mathbb{N}}$, which on input $1^k$ yields an element
$(p,g) \in I_k$.


Proof. We want to apply Lemma 6.5 to the index set $J := \mathbb{N} = \bigcup_{k \in \mathbb{N}} J_k$, $J_k :=
\{1^k\}$, and the families of sets $X_k := \mathrm{Primes}_k \times \{0,1\}^k$, $Y_k := I_k \subseteq X_k$ ($k \in \mathbb{N}$).
The number of primitive roots in $\mathbb{Z}_p^*$ is $\varphi(p-1)$, where $\varphi$ is the Eulerian totient
function (see Theorem A.36). For $x \in \mathbb{N}$, we have

\[ \varphi(x) = x \prod_{i=1}^{r} \left(1 - \frac{1}{p_i}\right) = x \prod_{i=1}^{r} \frac{p_i - 1}{p_i}, \]

where $p_1, \ldots, p_r$ are the primes dividing $x$ (see Corollary A.30). Since
$\prod_{i=1}^{r} \frac{p_i - 1}{p_i} \geq \prod_{i=2}^{r+1} \frac{i-1}{i} = \frac{1}{r+1}$ and $r + 1 \leq |x|$, we immediately see that
$\varphi(x) \cdot |x| \geq x$.⁵ In particular, we have $\varphi(p-1) \cdot k \geq p - 1 \geq 2^{k-1}$ for
$p \in \mathrm{Primes}_k$, and hence $2k \cdot |Y_k| \geq |X_k|$.

Given a prime $p \in \mathrm{Primes}_k$ and all prime numbers $q_1, \ldots, q_r$ dividing
$p - 1$, we can efficiently verify whether $g \in \{0,1\}^k$ is in $\mathbb{Z}_p^*$ and whether it is
a primitive root. Namely, we first test $0 < g < p$ and then apply the criterion for
primitive roots (see Algorithm A.39), i.e., we check whether $g^{(p-1)/q} \neq 1$ for
all prime divisors $q$ of $p - 1$.

We may apply Lemma 6.5 if there is a probabilistic polynomial uniform
sampling algorithm for $(\mathrm{Primes}_k)_{k \in \mathbb{N}}$ which not only outputs a prime $p$, but
also the prime factors of $p - 1$. Bach's algorithm (see [Bach88]) yields such
an algorithm: it generates uniformly distributed $k$-bit integers $n$, along with
their factorization. We may repeatedly generate such numbers $n$ until $n + 1$
is a prime.
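The membership test used in this proof is short enough to write out. The sketch below is illustrative only; the function name `is_primitive_root` is hypothetical, and the caller is assumed to supply the distinct prime factors of $p - 1$ (as Bach's algorithm would).

```python
def is_primitive_root(g, p, prime_factors):
    """Decide whether g generates Z_p^*, given the distinct prime factors of p - 1."""
    if not 0 < g < p:
        return False
    # g is a primitive root iff g^((p-1)/q) != 1 (mod p) for every prime q | p - 1
    return all(pow(g, (p - 1) // q, p) != 1 for q in prime_factors)
```

For $p = 23$ we have $p - 1 = 2 \cdot 11$, and the test accepts exactly $\varphi(22) = 10$ of the 22 elements of $\mathbb{Z}_{23}^*$, in line with the count of primitive roots stated above.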


6.4 Modular Powers 


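Let $I := \{(n,e) \mid n = pq,\ p \neq q \text{ primes},\ 0 < e < \varphi(n),\ e \text{ prime to } \varphi(n)\}$. The
family

\[ \mathrm{RSA} := (\mathrm{RSA}_{n,e} : \mathbb{Z}_n^* \to \mathbb{Z}_n^*,\ x \mapsto x^e)_{(n,e) \in I} \]

is called the RSA family.
Consider an $(n,e) \in I$, and let $d \in \mathbb{Z}_{\varphi(n)}^*$ be the inverse of $e$ mod $\varphi(n)$.
Then we have $x^{ed} = x^{ed \bmod \varphi(n)} = x$ for $x \in \mathbb{Z}_n^*$, since $x^{\varphi(n)} = 1$ (see
Proposition A.25). This shows that $\mathrm{RSA}_{n,e}$ is bijective and that the inverse
function is also an RSA function, namely $\mathrm{RSA}_{n,d} : \mathbb{Z}_n^* \to \mathbb{Z}_n^*,\ x \mapsto x^d$.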

5 Actually, $\varphi(x)$ is much closer to $x$. It can be shown that $\varphi(x) > \frac{x}{6 \log(|x|)}$ (see
Appendix A.2).

$\mathrm{RSA}_{n,e}$ can be computed by modular exponentiation, an efficient algorithm.
$d$ can be easily computed by the extended Euclidean algorithm A.5,
if $\varphi(n) = (p-1)(q-1)$ is known. No algorithm to compute $\mathrm{RSA}_{n,e}^{-1}$ in polynomial
time is known, if $p$, $q$ and $d$ are kept secret. We call $d$ (or $p, q$) the
trapdoor information for the RSA function.
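A toy numerical instance may help to fix ideas. The following Python sketch uses deliberately tiny primes, so it says nothing about security; the function names are hypothetical, and Python's built-in `pow(e, -1, phi)` (Python 3.8 or later) stands in for the extended Euclidean algorithm.

```python
import math

def rsa_keypair(p, q, e):
    """Return the public key (n, e) and the trapdoor d for distinct primes p, q."""
    n, phi = p * q, (p - 1) * (q - 1)
    if not 0 < e < phi or math.gcd(e, phi) != 1:
        raise ValueError("e must satisfy 0 < e < phi(n) and be prime to phi(n)")
    d = pow(e, -1, phi)           # inverse of e modulo phi(n)
    return (n, e), d

def rsa(n, exponent, x):
    """RSA_{n,exponent}(x) = x^exponent mod n."""
    return pow(x, exponent, n)
```

With $p = 61$, $q = 53$ and $e = 17$ we get $n = 3233$, $\varphi(n) = 3120$ and $d = 2753$, and `rsa(n, d, rsa(n, e, x))` returns `x` for every unit `x` modulo `n`.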

All known attacks to break RSA, if implemented by an efficient algorithm,
would deliver an efficient algorithm for factoring $n$. No polynomial-time
factoring algorithm is known; the best known algorithms have super-polynomial
running time. Therefore, it is widely believed that RSA cannot be efficiently
inverted. The following assumption makes this more precise.


Definition 6.7. Let $I_k := \{(n,e) \in I \mid n = pq, |p| = |q| = k\}$, with $k \in \mathbb{N}$,
and let $Q(X) \in \mathbb{Z}[X]$ be a positive polynomial. Let $A(n,e,y)$ be a probabilistic
polynomial algorithm. Then there exists a $k_0 \in \mathbb{N}$, such that

\[ \mathrm{prob}(A(n,e,y) = \mathrm{RSA}_{n,d}(y) : (n,e) \overset{u}{\leftarrow} I_k,\ y \overset{u}{\leftarrow} \mathbb{Z}_n^*) \leq \frac{1}{Q(k)} \]

for $k \geq k_0$.
This is called the RSA assumption.


Remarks: 


1. The assumption states the one-way property of the RSA family. The
algorithm $A$ models an adversary, who tries to compute $x = \mathrm{RSA}_{n,d}(y)$ from
$y = \mathrm{RSA}_{n,e}(x) = x^e$ (in $\mathbb{Z}_n^*$) without knowing the trapdoor information
$d$. By using Proposition 6.3, we may interpret the RSA assumption in
an analogous way to the discrete logarithm assumption (Definition 6.1).
The fraction of keys $(n,e)$ in $I_k$, for which the adversary $A$ has a
significant chance to succeed, is negligibly small if the security parameter $k$ is
sufficiently large.

2. $\mathrm{RSA}_{n,e}$ is bijective, and its range and domain coincide. Therefore, we
also speak of a family of one-way permutations (or a family of trapdoor
permutations).

3. Here and in what follows, we restrict the key set $I$ of the RSA family and
consider only those functions $\mathrm{RSA}_{n,e}$, where $n = pq$ is the product of two
primes of the same binary length. Instead, we could also define a stronger
version of the assumption, where $I_k$ is the set of pairs $(n,e)$, with $|n| = k$
(the primes may have different length). However, our statement is closer
to normal practice. To generate keys with a given security parameter $k$,
usually two primes of length $k$ are chosen and multiplied.

4. The RSA problem (computing $x$ from $x^e$) is random self-reducible (see
the analogous remark on the discrete logarithm problem on p. 154).


Stating the RSA assumption as above, we assume that the set $I$ of keys
can be uniformly sampled by an efficient algorithm.

Proposition 6.8. There is a probabilistic polynomial uniform sampling
algorithm for $I = (I_k)_{k \in \mathbb{N}}$, which on input $1^k$ yields a key $(n,e) \in I_k$, along
with the trapdoor information $(p,q,d)$.


Proof. Above (see the examples after Lemma 6.5), we saw that $\mathrm{Primes}_k$ can
be uniformly sampled by a probabilistic polynomial algorithm. Thus, there
is a probabilistic polynomial uniform sampling algorithm for

\[ (X_k := \{n = pq \mid p, q \text{ distinct primes},\ |p| = |q| = k\} \times \{0,1\}^{2k})_{k \in \mathbb{N}}. \]

In the proof of Proposition 6.6, we observed that $|x| \cdot \varphi(x) \geq x$, and we
immediately conclude that

\[ \frac{|\mathbb{Z}_{\varphi(n)}^*|}{2^{2k}} = \frac{\varphi(\varphi(n))}{2^{2k}} \geq \frac{\varphi(n)}{2^{2k} |\varphi(n)|} \geq \frac{n}{2^{2k} |n| |\varphi(n)|} \geq \frac{2^{2k-2}}{2^{2k} \cdot 2k \cdot 2k} = \frac{1}{16k^2}. \]

Thus, we can apply Lemma 6.5 to $Y_k := I_k \subseteq X_k$ and obtain the desired
sampling algorithm. It yields $(p,q,e)$. The inverse $d$ of $e$ in $\mathbb{Z}_{\varphi(n)}^*$ can be
computed using the extended Euclidean algorithm (Algorithm A.5). 
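The sampling algorithm from this proof can be sketched for toy parameters as follows. The code is an illustration under assumptions: all function names are hypothetical, trial division replaces the probabilistic primality test (acceptable only for very small $k$), and the iteration cap of Lemma 6.5 is omitted.

```python
import math
import random

def is_prime(n):
    """Trial division; a stand-in for a probabilistic test on toy sizes."""
    return n >= 2 and all(n % d for d in range(2, math.isqrt(n) + 1))

def sample_prime(k, rng):
    """Uniform k-bit prime by rejection from {0,1}^k."""
    while True:
        x = rng.getrandbits(k)
        if x.bit_length() == k and is_prime(x):
            return x

def sample_rsa_key(k, rng):
    """Draw triples (p, q, e) uniformly; retry until (n, e) is a valid key."""
    while True:
        p, q = sample_prime(k, rng), sample_prime(k, rng)
        e = rng.getrandbits(2 * k)            # e from {0,1}^(2k)
        phi = (p - 1) * (q - 1)
        if p != q and 0 < e < phi and math.gcd(e, phi) == 1:
            d = pow(e, -1, phi)               # trapdoor via modular inverse
            return (p * q, e), (p, q, d)
```

A whole new triple is drawn after every failed test. Keeping $(p, q)$ and only redrawing $e$ would be faster, but the resulting distribution on keys would no longer be uniform.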
Remarks: 


1. The uniform sampling algorithm for $(I_k)_{k \in \mathbb{N}}$ which we derived in the
proof of Proposition 6.8 is constructed by the method given in Lemma
6.5. Thus, it chooses triples $(p,q,e)$ uniformly and then tests whether
$e < \varphi(n) = (p-1)(q-1)$ and whether $e$ is prime to $\varphi(n)$. If this test
fails, a new triple $(p,q,e)$ is selected. It would be more natural and more
efficient to first choose a pair $(p,q)$ uniformly, and then, with $n = pq$
fixed, to choose an exponent $e$ uniformly from $\mathbb{Z}_{\varphi(n)}^*$. Then, however, the
statistical distance between the distribution of the elements $(n,e)$ and
the uniform distribution is not negligible. The sampling algorithm is not
uniform. Note that even for fixed $k$, there is a rather large variance of the
cardinalities $|\mathbb{Z}_{\varphi(n)}^*|$. Nevertheless, this more natural sampling algorithm
is an admissible key generator for the RSA family; i.e., the one-way
condition is preserved if the keys are sampled by it (see Definition 6.13, of
admissible key generators, and Exercise 1).

An analogous remark applies to the sampling algorithm, given in the 
proof of Proposition 6.6. 

2. We can generate the primes p and q by uniformly selecting numbers of 
length k and testing their primality by using a probabilistic prime num- 
ber test (see the examples after Lemma 6.5). There are also other very 
efficient algorithms for the generation of uniformly distributed primes 
(see, e.g., [Maurer95]).


6.5 Modular Squaring


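Let $I := \{n \mid n = pq,\ p, q \text{ distinct prime numbers},\ |p| = |q|\}$. The family

\[ \mathrm{Sq} := (\mathrm{Sq}_n : \mathbb{Z}_n^* \to \mathbb{Z}_n^*,\ x \mapsto x^2)_{n \in I} \]

is called the Square family.⁶ $\mathrm{Sq}_n$ is neither injective nor surjective. If
$\mathrm{Sq}_n^{-1}(x) \neq \emptyset$, then $|\mathrm{Sq}_n^{-1}(x)| = 4$ (see Proposition A.62).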

Modular squaring can be done efficiently. Square roots modulo p are com- 
putable by a probabilistic polynomial algorithm if p is a prime number (see Al- 
gorithm A.61). Applying the Chinese Remainder Theorem (Theorem A.29), 
it is then easy to derive an efficient algorithm that computes square roots in
$\mathbb{Z}_n^*$ if $n = pq$ ($p$ and $q$ are distinct prime numbers) and if the factorization of
$n$ is known.

Conversely, given an efficient algorithm for computing square roots in $\mathbb{Z}_n^*$,
an efficient algorithm for the factorization of n can be derived (see Proposition 
A.64). 
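Both directions can be illustrated with a small Python sketch. It is a simplification under stated assumptions: the function names are hypothetical, and square roots modulo a prime are computed only for primes $p \equiv 3 \bmod 4$, where $a^{(p+1)/4}$ is a root of a quadratic residue $a$; the general case needs a probabilistic algorithm.

```python
import math

def sqrt_mod_prime_3mod4(a, p):
    """A square root of the quadratic residue a modulo a prime p = 3 mod 4."""
    return pow(a, (p + 1) // 4, p)

def square_roots_mod_n(a, p, q):
    """All four square roots of a in Z_n^*, n = p*q, via the Chinese Remainder Theorem."""
    n = p * q
    rp = sqrt_mod_prime_3mod4(a % p, p)
    rq = sqrt_mod_prime_3mod4(a % q, q)
    roots = set()
    for sp in (rp, p - rp):
        for sq in (rq, q - rq):
            # the unique x mod n with x = sp (mod p) and x = sq (mod q)
            roots.add((sp * q * pow(q, -1, p) + sq * p * pow(p, -1, q)) % n)
    return sorted(roots)

def factor_from_roots(x, y, n):
    """If x^2 = y^2 (mod n) and x != +-y (mod n), gcd(x - y, n) is a proper factor."""
    return math.gcd(x - y, n)
```

For $n = 77 = 7 \cdot 11$, the square roots of 4 are 2, 9, 68 and 75. The roots 2 and 9 are not negatives of each other, and `factor_from_roots(2, 9, 77)` recovers the factor 7.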

No polynomial-time factoring algorithm is known; the best known algorithms
have super-polynomial running time. Therefore, it is widely believed that the
factors of $n$ (or, equivalently, square roots modulo $n$) cannot be computed
efficiently. We make this statement more precise by the following assumption.


Definition 6.9. Let $I_k := \{n \in I \mid n = pq, |p| = |q| = k\}$, with $k \in \mathbb{N}$,
and let $Q(X) \in \mathbb{Z}[X]$ be a positive polynomial. Let $A(n)$ be a probabilistic
polynomial algorithm. Then there exists a $k_0 \in \mathbb{N}$, such that

\[ \mathrm{prob}(A(n) = p : n \overset{u}{\leftarrow} I_k) \leq \frac{1}{Q(k)} \]

for $k \geq k_0$.
This is called the factoring assumption.


Stating the factoring assumption, we again assume that the set I of keys 
may be uniformly sampled by an efficient algorithm. 


Proposition 6.10. There is a probabilistic polynomial uniform sampling
algorithm for $I = (I_k)_{k \in \mathbb{N}}$, which on input $1^k$ yields a number $n \in I_k$, along
with its factors $p$ and $q$.


Proof. The algorithm chooses at random integers $p$ and $q$ with $|p| = |q| = k$,
and applies a probabilistic primality test (see Appendix A.8) to check whether
$p$ and $q$ are prime. By repeating the probabilistic primality test sufficiently
often, we can, with a probability $> 1 - 2^{-k}$, correctly test the primality of an
element $x$ in $\{0,1\}^k$. This sampling algorithm is uniform (Lemma 6.5).

6 As above in the RSA family, we only consider moduli $n$ which are the product
of two primes of equal binary length; see the remarks after the RSA assumption
(Definition 6.7).


Restricting the range and the domain to the set $\mathrm{QR}_n$ of squares modulo
$n$ (called the quadratic residues modulo $n$, see Definition A.48), the modular
squaring function can be made bijective in many cases. Of course, each
$x \in \mathrm{QR}_n$ has a square root. If $p$ and $q$ are distinct primes with $p, q \equiv 3 \bmod 4$
and $n := pq$, then exactly one of the four square roots of $x \in \mathrm{QR}_n$ is an
element in $\mathrm{QR}_n$ (see Proposition A.66). Taking as key set

\[ I := \{n \mid n = pq,\ p, q \text{ distinct prime numbers},\ |p| = |q|,\ p, q \equiv 3 \bmod 4\}, \]

we get a family

\[ \mathrm{Square} := (\mathrm{Square}_n : \mathrm{QR}_n \to \mathrm{QR}_n,\ x \mapsto x^2)_{n \in I} \]

of bijective functions, also called the Square family. Since the range and
the domain are the same set, we speak of a family of permutations. The family
of inverse maps is denoted by

\[ \mathrm{Sqrt} := (\mathrm{Sqrt}_n : \mathrm{QR}_n \to \mathrm{QR}_n)_{n \in I}. \]

$\mathrm{Sqrt}_n$ maps $x$ to the square root of $x$ which is an element of $\mathrm{QR}_n$.

The same considerations as those on $\mathrm{Sq}_n : \mathbb{Z}_n^* \to \mathbb{Z}_n^*$ above show that
$\mathrm{Square}_n$ is efficiently computable, and that computing $\mathrm{Sqrt}_n$ is equivalent
to factoring n. Square is a family of trapdoor permutations with trapdoor 
information p and q. 
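The permutation property is easy to check by brute force on a toy modulus. The sketch below is illustrative; the function names are hypothetical. It relies on the fact that for a prime $p \equiv 3 \bmod 4$ and a quadratic residue $y$, the root $y^{(p+1)/4} \bmod p$ is a power of $y$ and hence itself a quadratic residue.

```python
import math

def quadratic_residues(n):
    """QR_n: the squares of the units modulo n (brute force, toy sizes only)."""
    return {x * x % n for x in range(1, n) if math.gcd(x, n) == 1}

def sqrt_in_qr(y, p, q):
    """Sqrt_n(y): the square root of y that lies in QR_n, for primes p, q = 3 mod 4."""
    n = p * q
    rp = pow(y, (p + 1) // 4, p)   # root mod p that is a square mod p
    rq = pow(y, (q + 1) // 4, q)   # root mod q that is a square mod q
    # combine with the Chinese Remainder Theorem, using the trapdoor (p, q)
    return (rp * q * pow(q, -1, p) + rq * p * pow(p, -1, q)) % n
```

For $n = 77$ the set $\mathrm{QR}_{77}$ has $\varphi(77)/4 = 15$ elements, squaring maps it onto itself, and `sqrt_in_qr` inverts the squaring map on it.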


6.6 Quadratic Residuosity Property 


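If $p$ is a prime and $x \in \mathbb{Z}_p^*$, the Legendre symbol $\left(\frac{x}{p}\right)$ tells us whether $x$ is a
quadratic residue modulo $p$: $\left(\frac{x}{p}\right) = 1$ if $x \in \mathrm{QR}_p$, and $\left(\frac{x}{p}\right) = -1$ if $x \notin \mathrm{QR}_p$
(Definition A.51). The Legendre symbol can be easily computed using Euler's
criterion (Proposition A.52): $\left(\frac{x}{p}\right) = x^{(p-1)/2} \bmod p$.

Now, let $p$ and $q$ be distinct prime numbers and $n := pq$. The Jacobi
symbol $\left(\frac{x}{n}\right)$ is defined as $\left(\frac{x}{n}\right) := \left(\frac{x}{p}\right) \cdot \left(\frac{x}{q}\right)$ (Definition A.55). It is efficiently
computable for every element $x \in \mathbb{Z}_n^*$, without knowing the prime factors $p$
and $q$ of $n$ (see Algorithm A.59). The Jacobi symbol cannot be used to decide
whether $x \in \mathrm{QR}_n$. If $\left(\frac{x}{n}\right) = -1$, then $x$ is not in $\mathrm{QR}_n$. However, if $\left(\frac{x}{n}\right) = +1$,
both cases $x \in \mathrm{QR}_n$ and $x \notin \mathrm{QR}_n$ are possible. $x \in \mathbb{Z}_n^*$ is a quadratic residue
if and only if both $x \bmod p \in \mathbb{Z}_p^*$ and $x \bmod q \in \mathbb{Z}_q^*$ are quadratic residues,
which is equivalent to $\left(\frac{x}{p}\right) = \left(\frac{x}{q}\right) = +1$.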
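Both symbols can be computed in a few lines. The sketch below is illustrative and uses hypothetical function names; `jacobi` follows the standard binary algorithm based on quadratic reciprocity, which is assumed here to correspond to the kind of procedure the text refers to, and it never uses the factors of $n$.

```python
def legendre(x, p):
    """Legendre symbol of x modulo an odd prime p, by Euler's criterion."""
    r = pow(x, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def jacobi(x, n):
    """Jacobi symbol (x/n) for odd n > 0, computed without factoring n."""
    x %= n
    result = 1
    while x != 0:
        while x % 2 == 0:              # pull out factors of 2
            x //= 2
            if n % 8 in (3, 5):
                result = -result
        x, n = n, x                    # quadratic reciprocity
        if x % 4 == 3 and n % 4 == 3:
            result = -result
        x %= n
    return result if n == 1 else 0
```

For $n = 21$ the quadratic residues are 1, 4 and 16, but the elements 5, 17 and 20 also have Jacobi symbol $+1$: both of their Legendre symbols modulo 3 and 7 equal $-1$.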


Let $I := \{n \mid n = pq,\ p, q \text{ distinct prime numbers},\ |p| = |q|\}$ and let

\[ J_n^{+1} := \left\{ x \in \mathbb{Z}_n^* \;\middle|\; \left(\frac{x}{n}\right) = +1 \right\} \]

be the elements with Jacobi symbol $+1$. $\mathrm{QR}_n$ is a proper subset of $J_n^{+1}$.
Consider the functions

\[ \mathrm{PQR}_n : J_n^{+1} \to \{0,1\}, \quad \mathrm{PQR}_n(x) := \begin{cases} 1 & \text{if } x \in \mathrm{QR}_n, \\ 0 & \text{otherwise.} \end{cases} \]

The family $\mathrm{PQR} := (\mathrm{PQR}_n)_{n \in I}$ is called the quadratic residuosity family.

It is believed that there is no efficient algorithm which, without knowing
the factors of $n$, is able to decide whether $x \in J_n^{+1}$ is a quadratic residue. We
make this precise in the following assumption.


Definition 6.11. Let $I_k := \{n \in I \mid n = pq, |p| = |q| = k\}$, with $k \in \mathbb{N}$,
and let $Q(X) \in \mathbb{Z}[X]$ be a positive polynomial. Let $A(n,x)$ be a probabilistic
polynomial algorithm. Then there exists a $k_0 \in \mathbb{N}$, such that

\[ \mathrm{prob}(A(n,x) = \mathrm{PQR}_n(x) : n \overset{u}{\leftarrow} I_k,\ x \overset{u}{\leftarrow} J_n^{+1}) \leq \frac{1}{2} + \frac{1}{Q(k)} \]

for $k \geq k_0$.
This is called the quadratic residuosity assumption.


Remark. The assumption states that there is not a significant chance of
computing the predicate $\mathrm{PQR}_n$, if the factors of $n$ are secret. It differs a little from
the previous assumptions: the adversary algorithm $A$ now has to compute a
predicate. Since exactly half of the elements in $J_n^{+1}$ are quadratic residues (see
Proposition A.65), $A$ can always predict the correct value with probability
1/2, simply by tossing a coin. However, her probability of success is at most
negligibly more than 1/2.


Remark. The factoring assumption follows from the RSA assumption and 
also from the quadratic residuosity assumption. Hence, each of these two 
assumptions is stronger than the factoring assumption. 


6.7 Formal Definition of One-Way Functions 


As our examples show, one-way functions actually are families of functions. 
We give a formal definition of such families. 


Definition 6.12. Let $I = (I_k)_{k \in \mathbb{N}}$ be a key set with security parameter $k$.
Let $K$ be a probabilistic polynomial sampling algorithm for $I$, which on input
$1^k$ outputs $i \in I_k$.
A family

\[ f = (f_i : D_i \to R_i)_{i \in I} \]

of functions between finite sets $D_i$ and $R_i$ is a family of one-way functions
(or, for short, a one-way function) with key generator $K$ if and only if:

1. $f$ can be computed by a Monte Carlo algorithm $F(i,x)$.

2. There is a uniform sampling algorithm $S$ for $D := (D_i)_{i \in I}$, which on
input $i \in I$ outputs $x \in D_i$.

3. $f$ is not invertible by any efficient algorithm if the keys are generated
by $K$. More precisely, for every positive polynomial $Q \in \mathbb{Z}[X]$ and every
probabilistic polynomial algorithm $A(i,y)$ ($i \in I$, $y \in R_i$), there is a
$k_0 \in \mathbb{N}$, such that for all $k \geq k_0$

\[ \mathrm{prob}(f_i(A(i, f_i(x))) = f_i(x) : i \leftarrow K(1^k),\ x \overset{u}{\leftarrow} D_i) \leq \frac{1}{Q(k)}. \]

If $K$ is a uniform sampling algorithm for $I$, then we call $f$ a family of
one-way functions (or a one-way function), without explicitly referring to a
key generator.

If $f_i$ is bijective for all $i \in I$, then $f$ is called a bijective one-way function,
and if, in addition, the domain $D_i$ coincides with the range $R_i$ for all $i$, we
call $f$ a family of one-way permutations (or simply a one-way permutation).
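The probability in condition 3 describes an experiment that can be simulated directly. The following Python sketch is only an illustration of that experiment; all names are hypothetical, and the toy instance (squaring modulo the fixed small number 77) is of course not one-way, which is exactly what the brute-force adversary demonstrates.

```python
import math
import random

def inversion_experiment(keygen, sample_domain, f, adversary, k, trials, rng):
    """Estimate prob(f_i(A(i, f_i(x))) = f_i(x)) over i <- K(1^k), x <- D_i."""
    wins = 0
    for _ in range(trials):
        i = keygen(k, rng)                 # i <- K(1^k)
        x = sample_domain(i, rng)          # x <- D_i, uniformly
        y = f(i, x)
        if f(i, adversary(i, y)) == y:     # any pre-image of y counts
            wins += 1
    return wins / trials

# Toy instance (hypothetical): squaring on the units modulo a fixed n.
def toy_keygen(k, rng):
    return 77                              # ignores k; a real K outputs a key in I_k

def toy_domain(n, rng):
    while True:
        x = rng.randrange(1, n)
        if math.gcd(x, n) == 1:
            return x

def toy_square(n, x):
    return x * x % n

def brute_force(n, y):
    return next(x for x in range(1, n) if x * x % n == y)
```

Note that the adversary succeeds as soon as it outputs any pre-image of $y$, not necessarily the $x$ that was sampled. For a one-way function, the estimated success rate of every efficient adversary should fall below $1/Q(k)$ for large $k$.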


Examples: The examples studied earlier in this chapter are families of one- 
way functions, provided Assumptions 6.1, 6.7 and 6.9 are true: 


1. Discrete exponential function. 
2. Modular powers. 
3. Modular squaring. 


Our considerations on the "random generation of the key" above (Propositions
6.6, 6.8 and 6.10) show that there are uniform key generators for these
families. There are uniform sampling algorithms $S$ for the domains $\mathbb{Z}_{p-1}$ and
$\mathbb{Z}_n^*$, as we have seen in the examples after Lemma 6.5. Squaring a uniformly
selected $x \in \mathbb{Z}_n^*$, $x \mapsto x^2$, we get a uniform sampling algorithm for the
domains $\mathrm{QR}_n$ of the Square family.

Modular powers is a one-way permutation, as well as the modular squaring 
function Square. The discrete exponential function is a bijective one-way 
function. 


Remarks: 


1. Selecting an index $i$, for example, to use $f_i$ as an encryption function, is
equivalent to choosing a public key. Recall from Definition 6.2 that for
$i \in I_k$, the security parameter $k$ is a measure of the key length in bits.

2. Condition 3 means that pre-images of y := f;(x) cannot be computed in 
polynomial time if x is randomly and uniformly chosen from the domain 
(all inputs have the same probability), or equivalently, if y is random 
with respect to the image distribution induced by f. f is only called a 
one-way function if the random and uniform choice of elements in the 
domain can be accomplished by a probabilistic algorithm in polynomial 
time (condition 2). 
Definition 6.12 can be immediately generalized to the case of one-way
functions, where the inputs $x \in D_i$ are generated by any probabilistic
polynomial (not necessarily uniform) sampling algorithm $S$ (see, e.g.,
[Goldreich01]). The distribution $x \overset{u}{\leftarrow} D_i$ is replaced by $x \leftarrow S(i)$.

In this book, we consider only families of one-way functions with uni- 
formly distributed inputs. The keys generated by the key generator K 
may be distributed in a non-uniform way. 

3. The definition can be easily extended to formally define families of
trapdoor functions (or, for short, trapdoor functions). We only sketch this
definition. A bijective one-way function $f = (f_i)_{i \in I}$ is a trapdoor
function if the inverse family $f^{-1} := (f_i^{-1})_{i \in I}$ can be computed by a Monte
Carlo algorithm $F^{-1}(i, t_i, y)$, which takes as inputs the (public) key $i$, the
(secret) trapdoor information $t_i$ for $f_i$ and a function value $y := f_i(x)$. It
is required that the key generator $K$ generates the trapdoor information
$t_i$ along with $i$.

The RSA and the Square families are examples of trapdoor functions (see 
above). 

4. The probability of success of the adversary $A$ in the "one-way condition"
(condition 3) is taken over the random choice of a key of a given security
parameter $k$. It says that over all possibly generated keys, $A$ has on
average only a small probability of success. An adversary $A$ usually knows the
public key $i$ when performing her attack. Thus, in a concrete attack the
probability of success is given by the conditional probability assuming a
fixed $i$, and this conditional probability might be high even if the average
probability is negligibly small, as stated in condition 3. However,
according to Proposition 6.3, the probability of such insecure keys is negligibly
small. Thus, when randomly generating a key $i$ by $K$, the probability
of obtaining one for which $A$ has a significant chance of succeeding is
negligibly small (see Proposition 6.3 for a precise statement).

5. Condition 2 implies, in particular, that the binary length of the elements
in $D_i$ is bounded by the running time of $S(i)$, and hence is $\leq P(|i|)$ if $P$
is a polynomial bound for the running time of $S$.

6. In all our examples, the one-way function can be computed using a
deterministic polynomial algorithm. Computable by a Monte Carlo algorithm
(see Definition 5.3) means that there is a probabilistic polynomial
algorithm $F(i,x)$ with

\[ \mathrm{prob}(F(i,x) = f_i(x)) \geq 1 - 2^{-k} \quad (i \in I_k) \]

(see Proposition 5.6 and Exercise 5 in Chapter 5).

7. Families of one-way functions, as defined here, are also called collections
of strong one-way functions. They may be considered as a single one-way
function $\{0,1\}^* \to \{0,1\}^*$, defined on the infinite domain $\{0,1\}^*$ (see
[GolBel01]; [Goldreich01]). For the notion of weak one-way functions, see
Exercise 3.



The key generator of a one-way function f is not uniquely determined: 
there are more suitable key generation algorithms (see Proposition 6.14 be- 
low). We call them “admissible generators”. 


Definition 6.13. Let $f = (f_i : D_i \to R_i)_{i \in I}$, $I = (I_k)_{k \in \mathbb{N}}$, be a family of
one-way functions with key generator $K$. A probabilistic polynomial
algorithm $\tilde{K}$ that on input $1^k$ outputs a key $i \in I_k$ is called an admissible key
generator for $f$ if the one-way condition 3 of Definition 6.12 is satisfied for
$\tilde{K}$.


Proposition 6.14. Let $f = (f_i : D_i \to R_i)_{i \in I}$, $I = (I_k)_{k \in \mathbb{N}}$, be a family of
one-way functions with key generator $K$. Let $\tilde{K}$ be a probabilistic polynomial
sampling algorithm for $I$, which on input $1^k$ yields $i \in I_k$. Assume that the
family of distributions $i \leftarrow \tilde{K}(1^k)$ is polynomially bounded by the family of
distributions $i \leftarrow K(1^k)$ (see Definition B.25).

Then $\tilde{K}$ is also an admissible key generator for $f$.


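Proof. This is a consequence of Proposition B.26. Namely, apply Proposition
B.26 to $J := \mathbb{N} = \bigcup_{k \in \mathbb{N}} J_k$, $J_k := \{1^k\}$, $X_k := \{(i,x) \mid i \in I_k, x \in D_i\}$ and
the probability distributions $(i \leftarrow \tilde{K}(1^k),\ x \overset{u}{\leftarrow} D_i)$ and $(i \leftarrow K(1^k),\ x \overset{u}{\leftarrow} D_i)$,
$k \in \mathbb{N}$. The first family of distributions is polynomially bounded by the
second. Assume as event $\mathcal{E}_k$ that $f_i(A(i, f_i(x))) = f_i(x)$.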


Example. Let $f = (f_i : D_i \to R_i)_{i \in I}$ be a family of one-way functions (with
uniform key generation), and let $J \subseteq I$ with $|J_k| \cdot Q(k) \geq |I_k|$, for some
polynomial $Q$. Let $\tilde{K}$ be a uniform sampling algorithm for $J$. Then $i \leftarrow \tilde{K}(1^k)$
is polynomially bounded by the uniform distribution $i \overset{u}{\leftarrow} I_k$. Thus, $\tilde{K}$ is an
admissible key generator for $f$. This fact may be restated as:

$f|_J = (f_i : D_i \to R_i)_{i \in J}$ is also a one-way function.


Example. As a special case of the previous example, consider the RSA
one-way function (Section 6.4). Take as keys only pairs $(n,e) \in I_k$ (notation as
above), with $e$ a prime number in $\mathbb{Z}_{\varphi(n)}^*$. Since the number of primes in $\mathbb{Z}_{\varphi(n)}$
is of the same order as $\varphi(n)/k$ (by the Prime Number Theorem, Theorem
A.68), we get an admissible key generator in this way. In other words, the
classical RSA assumption (Assumption 6.7) implies an RSA assumption, with
$(n,e) \overset{u}{\leftarrow} I_k$ replaced by $(n,e) \overset{u}{\leftarrow} J_k$, where $J_k := \{(n,e) \in I_k \mid e \text{ prime}\}$.


Example. As already mentioned in Section 6.4, the key generator that first
uniformly chooses an $n = pq$ and then, in a second step, uniformly chooses
an exponent $e \in \mathbb{Z}_{\varphi(n)}^*$ is an admissible key generator for the RSA one-way
function. The distribution given by this generator is polynomially bounded
by the uniform distribution.

Similarly, we get an admissible key generator for the discrete exponential
function (Section 6.2) if we first uniformly generate a prime $p$ (together with
a factorization of $p - 1$) and then, for this fixed $p$, repeatedly select $g \overset{u}{\leftarrow} \mathbb{Z}_p^*$,
until $g$ happens to be a primitive root (see Exercise 1).


6.8 Hard-Core Predicates


Given a one-way function f, it is impossible to compute a pre-image x from 
y = f(x) using an efficient algorithm. Nevertheless, it is often easy to derive 
single bits of the pre-image x from f(x). For example, if f is the discrete 
exponential function, the least-significant bit of x is derived from f(x) in a 
straightforward way (see Chapter 7). On the other hand, since f is one way 
there should be other bits of the pre-image x, or more generally, properties 
of x stated as Boolean predicates, which are very hard to derive from f(x). 
Examples of such hard-core predicates are studied thoroughly in Chapter 7. 


Definition 6.15. Let $I = (I_k)_{k \in \mathbb{N}}$ be a key set with security parameter $k$. Let
$f = (f_i : D_i \to R_i)_{i \in I}$ be a family of one-way functions with key generator
$K$, and let $B = (B_i : D_i \to \{0,1\})_{i \in I}$ be a family of Boolean predicates.

$B$ is called a family of hard-core predicates (or, for short, a hard-core
predicate) of $f$ if and only if:

1. $B$ can be computed by a Monte Carlo algorithm $A_1(i,x)$.

2. $B(x)$ is not computable from $f(x)$ by an efficient algorithm; i.e., for
every positive polynomial $Q \in \mathbb{Z}[X]$ and every probabilistic polynomial
algorithm $A_2(i,y)$, there is a $k_0 \in \mathbb{N}$ such that for all $k \geq k_0$

\[ \mathrm{prob}(A_2(i, f_i(x)) = B_i(x) : i \leftarrow K(1^k),\ x \overset{u}{\leftarrow} D_i) \leq \frac{1}{2} + \frac{1}{Q(k)}. \]


Remarks: 


1. As above in the one-way conditions, we do not consider a single fixed
key in the probability statement of the definition. The probability is
also taken over the random generation of a key $i$, with a given security
parameter $k$. Hence, the meaning of statement 2 is: choosing both the key
$i$ with security parameter $k$ and $x \in D_i$ randomly, the adversary $A_2$ does
not have a significantly better chance of finding out the bit $B_i(x)$ from
$f_i(x)$ than by simply tossing a coin, provided the security parameter is
sufficiently large. In practice, the public key is known to the adversary,
and we are interested in the conditional probability of success assuming
a fixed public key $i$. Even if the security parameter $k$ is very large, there
may be keys $i$, such that $A_2$ has a significantly better chance than 1/2
of determining $B_i(x)$. The probability of such insecure keys is, however,
negligibly small. This is shown in Proposition 6.17.

2. Considering a family of one-way functions with hard-core predicate $B$,
we call a key generator $K$ admissible only if it guarantees the "hard-core"
condition 2 of Definition 6.15, in addition to the one-way condition 3 of
Definition 6.12. Proposition 6.14 remains valid for one-way functions with
hard-core predicates. This immediately follows from Proposition 6.17.


The inner product bit yields a generic hard-core predicate for all one-way
functions.

Theorem 6.16. Let $f = (f_i : D_i \to R_i)_{i \in I}$ be a family of one-way functions,
$D_i \subset \{0,1\}^*$ for all $i \in I$. Extend the functions $f_i$ to functions $\tilde{f}_i$
which on input $x \in D_i$ and $y \in \{0,1\}^{|x|}$ return the concatenation $f_i(x) \| y$. Let

\[ B_i(x,y) := \left( \sum_{j=1}^{l} x_j \cdot y_j \right) \bmod 2, \]

where $l = |x|$, $x = x_1 \ldots x_l$, $y = y_1 \ldots y_l$, $x_j, y_j \in \{0,1\}$,
be the inner product modulo 2. Then $B := (B_i)_{i \in I}$ is a hard-core predicate of
$\tilde{f} = (\tilde{f}_i)_{i \in I}$.


For a proof, see [GolLev89], [Goldreich99] or [Luby96]. 
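The predicate and the extended function are simple to compute. The sketch below represents bit strings as Python strings of `'0'` and `'1'`, which is an assumption made for readability; the names `inner_product_bit` and `extend` are hypothetical.

```python
def inner_product_bit(x, y):
    """B(x, y) = (sum over j of x_j * y_j) mod 2 for equal-length bit strings."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return sum(int(a) & int(b) for a, b in zip(x, y)) % 2

def extend(f):
    """Turn f into the extended function (x, y) -> f(x) || y."""
    return lambda x, y: f(x) + y
```

For instance, `inner_product_bit("1011", "1101")` is 0, because the two strings share a 1 in exactly two positions. The second argument `y` is output in the clear by the extended function, so only `x` remains hidden.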


Proposition 6.17. Let $I = (I_k)_{k \in \mathbb{N}}$ be a key set with security parameter $k$.
Let $f = (f_i : X_i \to Y_i)_{i \in I}$ be a family of functions between finite sets, and
let $B = (B_i : X_i \to \{0,1\})_{i \in I}$ be a family of Boolean predicates. Assume
that both $f$ and $B$ can be computed by a Monte Carlo algorithm, with inputs
$i \in I$ and $x \in X_i$. Let probability distributions be given on $I_k$ and $X_i$ for
all $k$ and $i$. Assume there is a probabilistic polynomial sampling algorithm $S$
for $X := (X_i)_{i \in I}$ which, on input $i \in I$, randomly chooses an $x \in X_i$ with
respect to the given distribution on $X_i$, i.e., $\mathrm{prob}(S(i) = x) = \mathrm{prob}(x)$. Then
the following statements are equivalent:

1. For every probabilistic polynomial algorithm $A$ with inputs $i \in I$ and
$y \in Y_i$ and output in $\{0,1\}$, and every positive polynomial $P$, there is a
$k_0 \in \mathbb{N}$, such that for all $k \geq k_0$

\[ \mathrm{prob}(A(i, f_i(x)) = B_i(x) : i \leftarrow I_k,\ x \leftarrow X_i) \leq \frac{1}{2} + \frac{1}{P(k)}. \]

2. For every probabilistic polynomial algorithm $A$ with inputs $i \in I$ and
$y \in Y_i$ and output in $\{0,1\}$, and all positive polynomials $Q$ and $R$, there
is a $k_0 \in \mathbb{N}$, such that for all $k \geq k_0$

\[ \mathrm{prob}\left(\left\{ i \in I_k \;\middle|\; \mathrm{prob}(A(i, f_i(x)) = B_i(x) : x \leftarrow X_i) > \frac{1}{2} + \frac{1}{Q(k)} \right\}\right) \leq \frac{1}{R(k)}. \]
Statement 2 implies statement 1, even if we omit "for every algorithm" in both
statements and instead consider a fixed probabilistic polynomial algorithm $A$.

Proof. An analogous computation as in the proof of Proposition 6.3 shows
that statement 2 implies statement 1.

Now, assume that statement 1 holds, and let $A(i,y)$ be a probabilistic
polynomial algorithm with output in $\{0,1\}$. Let $Q$ and $R$ be positive
polynomials. We abbreviate the probability of success of $A$, conditional on a fixed
$i$, by $p_i$:

\[ p_i := \mathrm{prob}(A(i, f_i(x)) = B_i(x) : x \leftarrow X_i). \]

Assume that for $k$ in an infinite subset $\mathcal{K} \subseteq \mathbb{N}$,

\[ \mathrm{prob}\left(\left\{ i \in I_k \;\middle|\; p_i > \frac{1}{2} + \frac{1}{Q(k)} \right\}\right) > \frac{1}{R(k)}. \tag{6.2} \]


If all $p_i$, $i \in I_k$, were $\geq 1/2$, we could easily conclude that their average also
significantly exceeds 1/2 and obtain a contradiction to statement 1.
Unfortunately, it might happen that many of the probabilities $p_i$ are significantly
larger than 1/2 and many are smaller than 1/2, whereas their average is close
to 1/2. The basic idea now is to modify $A$ and replace $A$ by an algorithm $\tilde{A}$.
We want to replace the output $A(i,y)$ by its complementary value $1 - A(i,y)$
if $p_i < 1/2$. In this way we could force all the probabilities to be $\geq 1/2$. At
this point we face another problem. We want to define $\tilde{A}$ as

\[ \tilde{A}(i,y) := \begin{cases} A(i,y) & \text{if } p_i \geq 1/2, \\ 1 - A(i,y) & \text{if } p_i < 1/2, \end{cases} \]

and see that we have to determine by a polynomial algorithm whether $p_i \geq
1/2$. At least we can compute the correct answer to this question with a high
probability in polynomial time. By Proposition 6.18, there is a probabilistic
polynomial algorithm $C(i)$, such that

\[ \mathrm{prob}\left( |C(i) - p_i| < \frac{1}{4Q(k)R(k)} \right) \geq 1 - \frac{1}{4Q(k)R(k)}. \]

We define that $\mathrm{Sign}(i) := +1$ if $C(i) \geq 1/2$, and $\mathrm{Sign}(i) := -1$ if $C(i) <
1/2$. Then Sign computes the sign $\sigma_i \in \{+1,-1\}$ of $(p_i - 1/2)$ with a high
probability if the distance of $p_i$ from 1/2 is not too small:

\[ \mathrm{prob}(\mathrm{Sign}(i) = \sigma_i) \geq 1 - \frac{1}{4Q(k)R(k)} \quad \text{for all } i \text{ with } \left| p_i - \frac{1}{2} \right| \geq \frac{1}{4Q(k)R(k)}. \]

Now the modified algorithm $\tilde{A} = \tilde{A}(i,y)$ is defined as follows. Let

\[ \tilde{A}(i,y) := \begin{cases} A(i,y) & \text{if } \mathrm{Sign}(i) = +1, \\ 1 - A(i,y) & \text{if } \mathrm{Sign}(i) = -1, \end{cases} \]

for $i \in I$ and $y \in Y_i$. Similarly as before, we write

\[ \tilde{p}_i := \mathrm{prob}(\tilde{A}(i, f_i(x)) = B_i(x) : x \leftarrow X_i) \]

for short. By the definition of $\tilde{A}$, $\tilde{p}_i = p_i$ if $\mathrm{Sign}(i) = +1$, and $\tilde{p}_i = 1 - p_i$
if $\mathrm{Sign}(i) = -1$. Hence $|\tilde{p}_i - 1/2| = |p_i - 1/2|$. Moreover, we have $\tilde{p}_i \geq 1/2$,
if $\mathrm{Sign}(i) = \sigma_i$, and $\mathrm{prob}(\mathrm{Sign}(i) \neq \sigma_i) \leq 1/(4Q(k)R(k))$ if $|p_i - 1/2| \geq
1/(4Q(k)R(k))$.
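Let

\[ I_{1,k} := \left\{ i \in I_k \;\middle|\; p_i - \frac{1}{2} > \frac{1}{Q(k)} \right\}, \qquad I_{2,k} := \left\{ i \in I_k \;\middle|\; \left| p_i - \frac{1}{2} \right| \geq \frac{1}{4Q(k)R(k)} \right\}. \]

We compute

\[ \begin{aligned}
\mathrm{prob}(\tilde{A}(i, f_i(x)) = B_i(x) : i \leftarrow I_k,\ x \leftarrow X_i) - \frac{1}{2}
&= \sum_{i \in I_k} \mathrm{prob}(i) \cdot \left( \tilde{p}_i - \frac{1}{2} \right) \\
&= \sum_{i \in I_{2,k}} \mathrm{prob}(i) \cdot \left( \tilde{p}_i - \frac{1}{2} \right) + \sum_{i \notin I_{2,k}} \mathrm{prob}(i) \cdot \left( \tilde{p}_i - \frac{1}{2} \right) \\
&=: (1).
\end{aligned} \]

For $i \notin I_{2,k}$, we have $|\tilde{p}_i - 1/2| = |p_i - 1/2| < 1/(4Q(k)R(k))$ and hence
$\tilde{p}_i - 1/2 > -1/(4Q(k)R(k))$. Thus

\[ \begin{aligned}
(1) &\geq \sum_{i \in I_{2,k}} \mathrm{prob}(i) \cdot \left( \tilde{p}_i - \frac{1}{2} \right) - \frac{1}{4Q(k)R(k)} \\
&= \sum_{i \in I_{2,k}} \mathrm{prob}(\mathrm{Sign}(i) = \sigma_i) \cdot \mathrm{prob}(i) \cdot \left( \tilde{p}_i - \frac{1}{2} \right) \\
&\quad + \sum_{i \in I_{2,k}} \mathrm{prob}(\mathrm{Sign}(i) \neq \sigma_i) \cdot \mathrm{prob}(i) \cdot \left( \tilde{p}_i - \frac{1}{2} \right) - \frac{1}{4Q(k)R(k)} \\
&=: (2).
\end{aligned} \]

We observed before that $\tilde{p}_i \geq 1/2$ if $\mathrm{Sign}(i) = \sigma_i$, and $\mathrm{prob}(\mathrm{Sign}(i) \neq \sigma_i) \leq
1/(4Q(k)R(k))$ for $i \in I_{2,k}$. Moreover, we obviously have $I_{1,k} \subseteq I_{2,k}$, and our
assumption (6.2) means that

\[ \mathrm{prob}(I_{1,k}) = \sum_{i \in I_{1,k}} \mathrm{prob}(i) > \frac{1}{R(k)} \]

for the infinitely many $k \in \mathcal{K}$. Therefore, we may continue our computation
for $k \in \mathcal{K}$ in the following way:

\[ \begin{aligned}
(2) &\geq \left( 1 - \frac{1}{4Q(k)R(k)} \right) \sum_{i \in I_{1,k}} \mathrm{prob}(i) \cdot \left( \tilde{p}_i - \frac{1}{2} \right) - \frac{1}{4Q(k)R(k)} - \frac{1}{4Q(k)R(k)} \\
&\geq \left( 1 - \frac{1}{4Q(k)R(k)} \right) \cdot \frac{1}{R(k)} \cdot \frac{1}{Q(k)} - \frac{1}{4Q(k)R(k)} - \frac{1}{4Q(k)R(k)} \\
&= \frac{1}{2Q(k)R(k)} - \frac{1}{4Q^2(k)R^2(k)} \\
&\geq \frac{1}{4Q(k)R(k)}.
\end{aligned} \]

Since $\mathcal{K}$ is infinite, we obtain a contradiction to statement 1 applied to $\tilde{A}$ and
$P = 4QR$, and the proof that statement 1 implies statement 2 is finished.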


As in the preceding proof, we sometimes want to compute probabilities 
with a probabilistic polynomial algorithm, at least approximately. 


Proposition 6.18. Let $B(i,x)$ be a probabilistic polynomial algorithm that
on input $i \in I$ and $x \in X_i$ outputs a bit $b \in \{0,1\}$. Assume that a probability
distribution is given on $X_i$, for every $i \in I$. Further assume that there is a
probabilistic polynomial sampling algorithm $S$ which on input $i \in I$ randomly
chooses an element $x \in X_i$ with respect to the distribution given on $X_i$, i.e.,
$\mathrm{prob}(S(i) = x) = \mathrm{prob}(x)$. Let $P$ and $Q$ be positive polynomials.

Let $p_i := \mathrm{prob}(B(i,x) = 1 : x \leftarrow X_i)$ be the probability of $B(i,x) = 1$,
assuming that $i$ is fixed. Then there is a probabilistic polynomial algorithm $A$
that approximates the probabilities $p_i, i \in I$, with high probability; i.e.,

$$\mathrm{prob}\left(|A(i) - p_i| < \frac{1}{P(|i|)}\right) \ge 1 - \frac{1}{Q(|i|)}.$$

Proof. We first observe that $\mathrm{prob}(B(i, S(i)) = 1) = p_i$ for every $i \in I$, since
$S(i)$ samples (by use of its coin tosses) according to the distribution given on
$X_i$. We define the algorithm $A$ on input $i$ as follows:

1. Let $t$ be the smallest $n \in \mathbb{N}$ with $n \ge 1/4 \cdot P(|i|)^2 \cdot Q(|i|)$.

2. Compute $B(i, S(i))$ $t$ times, and obtain the results $b_1, \ldots, b_t \in \{0,1\}$.

3. Let

   $$A(i) := \frac{1}{t} \sum_{j=1}^{t} b_j.$$
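
The three steps of $A$ can be sketched in Python as follows. This is only an illustration under stated assumptions: the callables `B`, `S`, `P`, `Q` and `size` are hypothetical stand-ins supplied by the caller (`size(i)` plays the role of the binary length $|i|$), and `P`, `Q` are assumed to return positive integers.

```python
import random


def estimate_probability(B, S, i, P, Q, size):
    """Estimate p_i = prob(B(i, x) = 1 : x <- X_i) by sampling.

    B, S, P, Q and size are caller-supplied stand-ins (assumptions of this sketch).
    """
    n = size(i)
    # Step 1: smallest integer t with t >= P(|i|)^2 * Q(|i|) / 4 (integer ceiling).
    t = max(1, -(-(P(n) ** 2 * Q(n)) // 4))
    # Step 2: run B on t independent samples drawn by S.
    bits = [B(i, S(i)) for _ in range(t)]
    # Step 3: return the relative frequency of the outcome 1.
    return sum(bits) / t


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy instance (assumption): X_i = {0,...,9}, B outputs 1 iff x < 3, so p_i = 0.3.
    estimate = estimate_probability(
        B=lambda i, x: 1 if x < 3 else 0,
        S=lambda i: rng.randrange(10),
        i=5,
        P=lambda n: 20,
        Q=lambda n: 10,
        size=lambda i: i.bit_length(),
    )
    print(estimate)
```

With the toy values above, the sketch draws $t = 1000$ samples; the number of samples grows with $P^2 \cdot Q$, which is what keeps $A$ polynomial.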


Applying Corollary B.17 to the $t$ independent computations of the random
variable $B(i, S(i))$, we get

$$\mathrm{prob}\left(|A(i) - p_i| < \frac{1}{P(|i|)}\right) \ge 1 - \frac{P(|i|)^2}{4t} \ge 1 - \frac{1}{Q(|i|)},$$

as desired.


Exercises


1. Let $S$ be the key generator for

   a. the RSA family, which on input $1^k$ first generates random prime
      numbers $p$ and $q$ of binary length $k$, and then repeatedly generates
      random exponents $e \in \mathbb{Z}_{\varphi(n)}$, $n := pq$, until it finds an $e$ prime to
      $\varphi(n)$.

   b. the Exp family, which on input $1^k$ first generates a random prime $p$
      of binary length $k$, together with the prime factors of $p - 1$, and then
      randomly chooses elements $g \in \mathbb{Z}_p^*$, until it finds a primitive root $g$
      (see the proof of Proposition 6.6).

   Show that $S$ is an admissible key generator for the RSA family and the
   Exp family (see the remark after Proposition 6.8).


2. Compare the running times of the uniform key generator for RSA keys 
constructed in Proposition 6.8 and the (more efficient) admissible key 
generator S in Exercise 1. 


3. Consider the Definition 6.12 of “strong” one-way functions. If we replace
   the probability statement in condition 3 by “There is a positive polynomial
   $Q$, such that for all probabilistic polynomial algorithms $A$

   $$\mathrm{prob}(f_i(A(i, f_i(x))) = f_i(x) : i \leftarrow K(1^k), x \stackrel{u}{\leftarrow} D_i) \le 1 - \frac{1}{Q(k)},$$

   for sufficiently large $k$”, then $f = (f_i)_{i \in I}$ is called a family of weak
   one-way functions.⁷
   Let $X_j := \{2, 3, \ldots, 2^j - 1\}$ be the set of numbers $> 1$ of binary length
   $\le j$. Let $D_n := \bigcup_{j=2}^{n-2} X_j \times X_{n-j}$ (disjoint union), and let $R_n := X_n$.
   Show that

   $$f := (f_n : D_n \to R_n, \; (x,y) \mapsto x \cdot y)_{n \in \mathbb{N}}$$

   is a weak (not a strong) one-way function, if the factoring assumption
   (Definition 6.9) is true.


4. We consider the representation problem (Section 4.5.3).

   Let $A(p, q, g_1, \ldots, g_r)$ be a probabilistic polynomial algorithm ($r \ge 2$).
   The inputs are primes $p$ and $q$, such that $q$ divides $p - 1$, and elements
   $g_1, \ldots, g_r \in \mathbb{Z}_p^*$ of order $q$. $A$ outputs two tuples $(a_1, \ldots, a_r)$ and
   $(a'_1, \ldots, a'_r)$ of integer exponents. Assume that $r$ is polynomially bounded
   by the binary length of $p$, i.e., $r \le T(|p|)$ for some polynomial $T$. We denote
   by $G_q$ the subgroup of order $q$ of $\mathbb{Z}_p^*$. Recall that $G_q$ is cyclic and
   each $g \in G_q, g \ne 1$, is a generator (Lemma A.40). Let $P$ be a positive
   polynomial, and let $K$ be the set of pairs $(p,q)$, such that

   $$\begin{aligned}
   \mathrm{prob}\Big( & A(p, q, g_1, \ldots, g_r) = ((a_1, \ldots, a_r), (a'_1, \ldots, a'_r)), \\
   & (a_1, \ldots, a_r) \ne (a'_1, \ldots, a'_r), \; \prod_{j=1}^{r} g_j^{a_j} = \prod_{j=1}^{r} g_j^{a'_j} : \\
   & g_j \stackrel{u}{\leftarrow} G_q \setminus \{1\}, \; 1 \le j \le r \Big) \ge \frac{1}{P(|p|)}.
   \end{aligned}$$

   Show that for every positive polynomial $Q$ there is a probabilistic
   polynomial algorithm $\tilde{A} = \tilde{A}(p, q, g, y)$, such that

   $$\mathrm{prob}(\tilde{A}(p, q, g, y) = \mathrm{Log}_{p,g}(y)) \ge 1 - 2^{-Q(|p|)},$$

   for all $(p,q) \in K$, $g \in G_q \setminus \{1\}$ and $y \in G_q$.


⁷ A weak one-way function $f$ yields a strong one-way function by the following
construction: $(x_1, \ldots, x_{2kQ(k)}) \mapsto (f_i(x_1), \ldots, f_i(x_{2kQ(k)}))$ (where $i \in I_k$). See
[Luby96] for a proof.


5. Let $J_k := \{n \in \mathbb{N} \mid n = pq, \; p, q \text{ distinct primes}, \; |p| = |q| = k\}$ be the
   set of RSA moduli with security parameter $k$. For a prime $\tilde{p}$, we denote
   by $J_{k,\tilde{p}} := \{n \in J_k \mid \tilde{p} \text{ does not divide } \varphi(n)\}$ the set of those moduli for
   which $\tilde{p}$ may serve as an RSA exponent. Let $\mathrm{Primes}_{\le 2k}$ be the primes of
   binary length $\le 2k$.
   Show that the RSA assumption remains valid if we first choose a prime
   exponent and then a suitable modulus. More precisely, show that the
   classical RSA assumption (Definition 6.7) implies that:
   For every probabilistic polynomial algorithm $A$ and every positive
   polynomial $P$ there is a $k_0 \in \mathbb{N}$, such that for all $k \ge k_0$

   $$\mathrm{prob}(A(n, \tilde{p}, x^{\tilde{p}}) = x \mid \tilde{p} \stackrel{u}{\leftarrow} \mathrm{Primes}_{\le 2k}, \; n \stackrel{u}{\leftarrow} J_{k,\tilde{p}}, \; x \stackrel{u}{\leftarrow} \mathbb{Z}_n^*) \le \frac{1}{P(k)}.$$

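6. Let $f = (f_i : D_i \to R_i)_{i \in I}$, $I = (I_k)_{k \in \mathbb{N}}$, be a family of one-way
   functions with key generator $K$, and let $B = (B_i : D_i \to \{0,1\})_{i \in I}$ be a
   hard-core predicate for $f$.
   Show that for every positive polynomial $P$, there is a $k_0 \in \mathbb{N}$ such that
   for all $k \ge k_0$

   $$\left| \mathrm{prob}(B_i(x) = 0 : i \leftarrow K(1^k), x \stackrel{u}{\leftarrow} D_i) - \frac{1}{2} \right| \le \frac{1}{P(k)}.$$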


7. Let $f = (f_i : D_i \to R_i)_{i \in I}$, $I = (I_k)_{k \in \mathbb{N}}$, be a family of one-way
   functions with key generator $K$, and let $B = (B_i : D_i \to \{0,1\})_{i \in I}$ be a
   family of predicates which is computable by a Monte Carlo algorithm.
   Show that $B$ is a hard-core predicate of $f$ if and only if for every
   probabilistic polynomial algorithm $A(i,x,y)$ and every positive polynomial $P$
   there is a $k_0$, such that for all $k \ge k_0$

   $$\begin{aligned}
   \big| \, & \mathrm{prob}(A(i, f_i(x), B_i(x)) = 1 : i \leftarrow K(1^k), x \stackrel{u}{\leftarrow} D_i) \\
   & - \mathrm{prob}(A(i, f_i(x), z) = 1 : i \leftarrow K(1^k), x \stackrel{u}{\leftarrow} D_i, z \stackrel{u}{\leftarrow} \{0,1\}) \, \big| \le \frac{1}{P(k)}.
   \end{aligned}$$

   If the functions $f_i$ are bijective, then the latter statement means that
   the family $(\{(f_i(x), B_i(x)) : x \stackrel{u}{\leftarrow} D_i\})_{i \in I}$ of distributions cannot be
   distinguished from the uniform distributions on $(R_i \times \{0,1\})_{i \in I}$ by a
   probabilistic polynomial algorithm.


8. Let $I = (I_k)_{k \in \mathbb{N}}$ be a key set with security parameter $k$. Consider
   probabilistic polynomial algorithms $A$ which on input $i \in I$ and $x \in X_i$
   compute a Boolean value $A(i,x) \in \{0,1\}$. Assume that one family of
   probability distributions is given on $(I_k)_{k \in \mathbb{N}}$, and two families of
   probability distributions $p := (p_i)_{i \in I}$ and $q := (q_i)_{i \in I}$ are given on
   $X := (X_i)_{i \in I}$. Further assume that there are probabilistic polynomial
   sampling algorithms $S_1(i)$ and $S_2(i)$ which randomly choose an $x \in X_i$ with
   $\mathrm{prob}(S_1(i) = x) = p_i(x)$ and $\mathrm{prob}(S_2(i) = x) = q_i(x)$, for all $i \in I$ and
   $x \in X_i$.

   Prove a result that is analogous to the statements of Propositions 6.3
   and 6.17 for

   $$\left| \mathrm{prob}(A(i,x) = 1 : i \leftarrow I_k, x \stackrel{p_i}{\leftarrow} X_i) - \mathrm{prob}(A(i,x) = 1 : i \leftarrow I_k, x \stackrel{q_i}{\leftarrow} X_i) \right| \le \frac{1}{P(k)}.$$


9. Let $n := pq$, with distinct primes $p$ and $q$. The quadratic residuosity
   assumption (Definition 6.11) states that it is infeasible to decide whether
   a given $x \in J_n^{+1}$ is a quadratic residue or not. If $p$ and $q$ are kept secret,
   no efficient algorithm for selecting a quadratic non-residue modulo $n$
   is known. Thus, it might be easier to decide quadratic residuosity if,
   additionally, a random quadratic non-residue with Jacobi symbol 1 is
   revealed. In [GolMic84] (Section 6.2) it is shown that this is not the case.
   More precisely, let $I_k := \{n \mid n = pq, \; p, q \text{ distinct primes}, \; |p| = |q| = k\}$
   and $I := (I_k)_{k \in \mathbb{N}}$. Let $\mathrm{QNR}_n := \mathbb{Z}_n^* \setminus \mathrm{QR}_n$ be the set of quadratic
   non-residues (see Definition A.48) and $\mathrm{QNR}_n^{+1} := \mathrm{QNR}_n \cap J_n^{+1}$. Then the
   following statements are equivalent:

   a. For all probabilistic polynomial algorithms $A$, with inputs $n \in I$ and
      $x \in J_n^{+1}$ and output in $\{0,1\}$, and every positive polynomial $P$, there
      exists a $k_0 \in \mathbb{N}$ such that for all $k \ge k_0$

      $$\mathrm{prob}(A(n,x) = \mathrm{PQR}_n(x) : n \stackrel{u}{\leftarrow} I_k, x \stackrel{u}{\leftarrow} J_n^{+1}) \le \frac{1}{2} + \frac{1}{P(k)}.$$

   b. For all probabilistic polynomial algorithms $A$, with inputs $n \in I$ and
      $z, x \in J_n^{+1}$ and output in $\{0,1\}$, and every positive polynomial $P$,
      there exists a $k_0 \in \mathbb{N}$ such that for all $k \ge k_0$

      $$\mathrm{prob}(A(n,z,x) = \mathrm{PQR}_n(x) : n \stackrel{u}{\leftarrow} I_k, z \stackrel{u}{\leftarrow} \mathrm{QNR}_n^{+1}, x \stackrel{u}{\leftarrow} J_n^{+1}) \le \frac{1}{2} + \frac{1}{P(k)}.$$

   Study [GolMic84] and give a proof.


7. Bit Security of One-Way Functions 


Let $f : X \to Y$ be a bijective one-way function and let $x \in X$. Sometimes
it is possible to compute some bits of $x$ from $f(x)$ without inverting $f$. A
function $f$ does not necessarily hide everything about $x$, even if $f$ is one way.
Let $b$ be a bit of $x$. We call $b$ a secure bit of $f$ if it is as difficult to compute $b$
from $f(x)$ as it is to compute $x$ from $f(x)$. We prove that the most-significant
bit of $x$ is a secure bit of Exp, and that the least-significant bit is a secure
bit of RSA and Square.

We show how to compute $x$ from $\mathrm{Exp}(x)$, assuming that we can compute
the most-significant bit of $x$ from $\mathrm{Exp}(x)$. Then we show the same for the
least-significant bit and RSA or Square. First we assume that deterministic
algorithms are used to compute the most- or least-significant bit. In this much
easier case, we demonstrate the basic ideas. Then we study the probabilistic
case. We assume that we can compute the most- or least-significant bit with
a probability $p \ge 1/2 + 1/P(|x|)$, for some positive polynomial $P$, and derive
that then $x$ can be computed with an overwhelmingly high probability.

As a consequence, the discrete logarithm assumption implies that the 
most-significant bit is a hard-core predicate for the Exp family. Given the 
RSA or the factoring assumption, the least-significant bit yields a hard-core 
predicate for the RSA or the Square family. 

Bit security is not only of theoretical interest. Bleichenbacher’s 1-Million- 
Chosen-Ciphertext Attack against PKCS#1-based schemes shows that a 
leaking secure bit can lead to dangerous practical attacks (see Section 3.3.3). 

Let $n, r \in \mathbb{N}$ such that $2^{r-1} \le n < 2^r$. As usual, the binary encoding of
$x \in \mathbb{Z}_n$ is the binary encoding in $\{0,1\}^r$ of the representative of $x$ between
0 and $n - 1$ as an unsigned number (see Appendix A.2). This defines an
embedding $\mathbb{Z}_n \subset \{0,1\}^r$. Bits of $x$ and properties of $x$ that depend on a
representative of $x$ are defined relative to this embedding. The property “$x$
is even”, for example, is such a property.


7.1 Bit Security of the Exp Family 


Let $p$ be an odd prime number and let $g$ be a primitive root in $\mathbb{Z}_p^*$. We consider
the discrete exponential function

$$\mathrm{Exp}_{p,g} : \mathbb{Z}_{p-1} \to \mathbb{Z}_p^*, \; x \mapsto g^x,$$

and its inverse

$$\mathrm{Log}_{p,g} : \mathbb{Z}_p^* \to \mathbb{Z}_{p-1},$$

which is the discrete logarithm function.

$\mathrm{Log}_{p,g}(x)$ is even if and only if $x$ is a square (Lemma A.49). The square
property modulo a prime can be computed efficiently by Euler’s criterion
(Proposition A.52). Hence, we have an efficient algorithm that computes the
least-significant bit of $x$ from $\mathrm{Exp}_{p,g}(x)$. The most-significant bit, however,
is as difficult to compute as the discrete logarithm.
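
As a small illustration of the efficient direction, the following Python sketch reads off the least-significant bit of the discrete logarithm with Euler’s criterion. The function names are hypothetical and chosen for this example only; the toy values $p = 23$, $g = 5$ in the usage note are an assumption, not taken from the text.

```python
def is_quadratic_residue(x, p):
    """Euler's criterion: x in Z_p^* is a square iff x^((p-1)/2) = 1 mod p."""
    return pow(x, (p - 1) // 2, p) == 1


def lsb_of_discrete_log(x, p):
    """Least-significant bit of Log_{p,g}(x); the result does not depend on g."""
    return 0 if is_quadratic_residue(x, p) else 1


if __name__ == "__main__":
    p, g = 23, 5  # toy parameters (assumption): 5 is a primitive root modulo 23
    for y in range(p - 1):
        assert lsb_of_discrete_log(pow(g, y, p), p) == y % 2
```

Note that the primitive root $g$ is not needed to compute the bit; it only appears in the check.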


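Definition 7.1. Let $p$ be an odd prime, and let $g$ be a primitive root in $\mathbb{Z}_p^*$:

1. The predicate $\mathrm{Msb}_p$ is defined by

   $$\mathrm{Msb}_p : \mathbb{Z}_{p-1} \to \{0,1\}, \quad x \mapsto \begin{cases} 0 & \text{if } 0 \le x < \frac{p-1}{2},\\ 1 & \text{if } \frac{p-1}{2} \le x < p - 1. \end{cases}$$

2. The predicate $B_{p,g}$ is defined by

   $$B_{p,g} : \mathbb{Z}_p^* \to \{0,1\}, \quad B_{p,g}(x) := \mathrm{Msb}_p(\mathrm{Log}_{p,g}(x)).$$


Remark. If $p - 1$ is a power of 2, then the predicate $\mathrm{Msb}_p$ is the most-significant
bit of the binary encoding of $x$.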


Let $x \in \mathrm{QR}_p$, $x \ne 1$. There is a probabilistic polynomial algorithm which
computes the square roots of $x$ (Algorithm A.61). Let $y := y_k \ldots y_0$, $y_i \in
\{0,1\}$, be the binary representation of $y := \mathrm{Log}_{p,g}(x)$. As we observed before,
$y_0 = 0$. Therefore, $w_1 := g^{\tilde{y}}$ with $\tilde{y} := y_k \ldots y_1$ is a root of $x$ with $B_{p,g}(w_1) =
0$. The other root $w_2$ is $g^{\tilde{y} + (p-1)/2}$, and obviously $B_{p,g}(w_2) = 1$. Thus, exactly
one of the two square roots $w$ satisfies the condition $B_{p,g}(w) = 0$.


Definition 7.2. Let $x \in \mathrm{QR}_p$. The square root $w$ of $x$ which satisfies the
condition $B_{p,g}(w) = 0$ is called the principal square root of $x$. The map
$\mathrm{PSqrt}_{p,g}$ is defined by

$$\mathrm{PSqrt}_{p,g} : \mathrm{QR}_p \to \mathbb{Z}_p^*, \quad x \mapsto \text{principal square root of } x.$$


Remark. The ability to compute the principal square root in polynomial time
is equivalent to the ability to compute the predicate $B_{p,g}$ in polynomial time.
Namely, let $A$ be a deterministic polynomial algorithm for the computation
of $\mathrm{PSqrt}_{p,g}$. Then the algorithm


Algorithm 7.3.
int B(int p, g, x)
1 if A(p, g, x^2) = x
2   then return 0
3   else return 1


computes $B_{p,g}$ and is deterministic polynomial.

Conversely, let $B$ be a deterministic polynomial algorithm which computes
$B_{p,g}$. If Sqrt is a polynomial algorithm for the computation of square roots
modulo $p$, then the polynomial algorithm


Algorithm 7.4.
int A(int p, g, x)
1 {u, v} ← Sqrt(x, p)
2 if B(p, g, u) = 0
3   then return u
4   else return v


computes $\mathrm{PSqrt}_{p,g}$. It is deterministic if Sqrt is deterministic. We have a
polynomial algorithm Sqrt (Algorithm A.61). It is deterministic for $p \equiv 3 \bmod 4$.
In the other case, the only non-deterministic step is to find a quadratic
non-residue modulo $p$, which is an easy task. If we select $t$ numbers in $\{1, \ldots, p - 1\}$
at random, then the probability of finding a quadratic non-residue is $1 - 1/2^t$.
Thus, the probability of success of $A$ can be made almost 1, independent of
the size of the input $x$.
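
The equivalence in this remark can be tried out on a toy prime. The Python sketch below is an illustration under assumptions, not an efficient implementation: `make_toy_psqrt` is a hypothetical helper that obtains $\mathrm{PSqrt}_{p,g}$ from a brute-force logarithm table (feasible only for tiny $p$), and the square-root step uses the formula $x^{(p+1)/4}$, which is valid only for $p \equiv 3 \bmod 4$.

```python
def make_toy_psqrt(p, g):
    """Brute-force PSqrt_{p,g} for a tiny prime p (illustration only)."""
    log = {pow(g, y, p): y for y in range(p - 1)}
    # For x in QR_p the logarithm is even; halving it gives the principal root.
    return lambda x: pow(g, log[x] // 2, p)


def predicate_from_psqrt(p, x, psqrt):
    """Algorithm 7.3: B_{p,g}(x) = 0 iff x is the principal square root of x^2."""
    return 0 if psqrt(x * x % p) == x else 1


def psqrt_from_predicate(p, x, predicate):
    """Algorithm 7.4, assuming p = 3 mod 4 so that Sqrt is deterministic."""
    u = pow(x, (p + 1) // 4, p)  # one square root of x
    v = p - u                    # the other square root
    return u if predicate(u) == 0 else v


if __name__ == "__main__":
    p, g = 23, 5  # toy parameters (assumption): 5 is a primitive root modulo 23
    psqrt = make_toy_psqrt(p, g)
    print([predicate_from_psqrt(p, x, psqrt) for x in range(1, p)])
```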


In the following proposition and theorem, we show how to reduce the
computation of the discrete logarithm function to the computation of $\mathrm{PSqrt}_{p,g}$.
Given an algorithm $A_1$ for computing $\mathrm{PSqrt}_{p,g}$, we develop an algorithm $A_2$
for the discrete logarithm which calls $A_1$ as a subroutine.¹ The resulting
algorithm $A_2$ has the same complexity as $A_1$. Therefore, $\mathrm{PSqrt}_{p,g}$ is also believed
to be not efficiently computable.


Proposition 7.5. Let $A_1$ be a deterministic polynomial algorithm, such that

$$A_1(p, g, x) = \mathrm{PSqrt}_{p,g}(x) \text{ for all } x \in \mathrm{QR}_p,$$

with $p$ an odd prime and $g$ a primitive root in $\mathbb{Z}_p^*$. Then there is a deterministic
polynomial algorithm $A_2$, such that

$$A_2(p, g, x) = \mathrm{Log}_{p,g}(x) \text{ for all } x \in \mathrm{QR}_p.$$


Proof. The basic idea of the reduction is the following:

1. Let $x = g^y \in \mathbb{Z}_p^*$ and $y = y_k \ldots y_0$, $y_i \in \{0,1\}$, $i = 0, \ldots, k$, be the binary
   encoding of $y$. We compute the bits of $y$ from right to left. Bit $y_0$ is 0 if
   and only if $x \in \mathrm{QR}_p$ (Lemma A.49). This condition can be tested by use
   of Euler’s criterion for quadratic residuosity (Proposition A.52).

2. To get the next bit $y_1$, we replace $x$ by $x g^{-1} = g^{y_k \ldots y_1 0}$ if $y_0 = 1$. Then
   bit $y_1$ can be obtained from $\mathrm{PSqrt}_{p,g}(x) = g^{y_k \ldots y_1}$ as in step 1.

The following algorithm $A_2$, which calls $A_1$ as a subroutine, computes $\mathrm{Log}_{p,g}$.


¹ In our construction we use $A_1$ as an “oracle” for $\mathrm{PSqrt}_{p,g}$. Therefore, algorithms
such as $A_2$ are sometimes called “oracle algorithms”.


Algorithm 7.6.
int A2(int p, g, x)
1 y ← empty word, k ← |p|
2 for c ← 0 to k − 1 do
3   if x ∈ QR_p
4     then y ← y||0
5     else y ← y||1
6          x ← x g^{-1}
7   x ← A1(p, g, x)
8 return y


This completes the proof.
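
The reduction can be exercised on a toy prime. The following Python sketch follows the structure of Algorithm 7.6, with two assumptions made for this example: the oracle $A_1$ is simulated by a brute-force table (the hypothetical helper `make_toy_psqrt`, usable only for tiny $p$), and the bits are collected as an integer, least-significant bit first, instead of as a word. `pow(g, -1, p)` requires Python 3.8 or later.

```python
def make_toy_psqrt(p, g):
    """Stand-in for the oracle A1: brute-force PSqrt_{p,g} for a tiny prime p."""
    log = {pow(g, y, p): y for y in range(p - 1)}
    return lambda x: pow(g, log[x] // 2, p)


def dlog_via_psqrt(p, g, x, psqrt):
    """Compute Log_{p,g}(x) bit by bit with a PSqrt oracle (cf. Algorithm 7.6)."""
    k = p.bit_length()
    g_inv = pow(g, -1, p)
    y = 0
    for c in range(k):
        # Euler's criterion: x is a non-residue iff the current lowest bit is 1.
        if pow(x, (p - 1) // 2, p) != 1:
            y |= 1 << c
            x = x * g_inv % p      # clear the lowest bit of the exponent
        x = psqrt(x)               # halve the (now even) exponent
    return y


if __name__ == "__main__":
    p, g = 23, 5  # toy parameters (assumption): 5 is a primitive root modulo 23
    oracle = make_toy_psqrt(p, g)
    print([dlog_via_psqrt(p, g, pow(g, y, p), oracle) for y in range(p - 1)])
```

Taking the principal root in each round matters: it keeps the exponent below $(p-1)/2$ after halving, so the remaining bits are read off unchanged.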


Theorem 7.7. Let $P, Q \in \mathbb{Z}[X]$ be positive polynomials and $A_1$ be a
probabilistic polynomial algorithm, such that

$$\mathrm{prob}(A_1(p, g, x) = \mathrm{PSqrt}_{p,g}(x) : x \stackrel{u}{\leftarrow} \mathrm{QR}_p) \ge \frac{1}{2} + \frac{1}{P(k)},$$

where $p$ is an odd prime number, $g$ is a primitive root in $\mathbb{Z}_p^*$ and $k = |p|$ is
the binary length of $p$. Then there is a probabilistic polynomial algorithm $A_2$,
such that for every $x \in \mathbb{Z}_p^*$,

$$\mathrm{prob}(A_2(p, g, x) = \mathrm{Log}_{p,g}(x)) \ge 1 - 2^{-Q(k)}.$$


Proof. Let $\varepsilon := 1/P(k)$. In order to reduce $\mathrm{Log}_{p,g}$ to $\mathrm{PSqrt}_{p,g}$, we intend to
proceed as in the deterministic case (Proposition 7.5). There, $A_1$ is applied
$k$ times by $A_2$. $A_2$ correctly yields the desired logarithm if PSqrt is correctly
computed by $A_1$ in each step. Now the algorithm $A_1$ computes the function
$\mathrm{PSqrt}_{p,g}$ with a probability of success of only $\ge 1/2 + \varepsilon$. Thus, the probability
of success of $A_2$ is $\ge (1/2 + \varepsilon)^k$. This value is exponentially close to 0, and
hence is too small.
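
To get a feeling for how small this bound is, it can be evaluated on a logarithmic scale. The snippet below is a numerical aside; the concrete values $k = 1024$ and $\varepsilon = 1/k$ in the usage note are assumptions chosen for illustration.

```python
import math


def log2_naive_success_bound(k, eps):
    """Return log2 of (1/2 + eps)^k, the bound for k consecutive correct oracle calls."""
    return k * math.log2(0.5 + eps)


if __name__ == "__main__":
    # Assumed example: k = 1024 and eps = 1/P(k) with P(k) = k.
    print(log2_naive_success_bound(1024, 1 / 1024))
```

For these values the bound is below $2^{-1000}$, so repeating $A_2$ polynomially often does not help.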

The basic idea now is to replace $A_1$ by an algorithm $B$ which computes
$\mathrm{PSqrt}_{p,g}(x)$ with a high probability of success, for a polynomial fraction of
inputs.


Lemma 7.8. Under the assumptions of the theorem, let $t := k\varepsilon^{-2}$. Then
there is a probabilistic polynomial algorithm $B$, such that

$$\mathrm{prob}(B(x) = \mathrm{PSqrt}_{p,g}(x)) \ge 1 - \frac{1}{k}, \quad \text{for } x = g^{2s}, \; 0 \le s \le \frac{p-1}{2t}.$$

Proof (of the Lemma). Let $x = g^{2s}$, $0 \le s \le (p-1)/2t$. In our algorithm $B$ we
want to increase the probability of success on input $x$ by computing $A_1(x)$
repeatedly. By assumption, we have

$$\mathrm{prob}(A_1(p, g, x) = \mathrm{PSqrt}_{p,g}(x) : x \stackrel{u}{\leftarrow} \mathrm{QR}_p) \ge \frac{1}{2} + \varepsilon.$$

Here the probability is also taken over the random choice of $x \in \mathrm{QR}_p$.
Therefore, we must modify $x$ randomly each time we apply $A_1$. For this
purpose, we iteratively select $r \in \{0, \ldots, \frac{p-1}{2} - 1\}$ at random and compute
$A_1(p, g, x g^{2r})$. If $A_1(p, g, x g^{2r})$ successfully computes $\mathrm{PSqrt}_{p,g}(x g^{2r})$, then
$\mathrm{PSqrt}_{p,g}(x) = A_1(p, g, x g^{2r}) \cdot g^{-r}$, at least if $s + r < (p-1)/2$. The latter
happens with a high probability, since $s$ is assumed to be small. Since the
points $x g^{2r}$ are sampled randomly and independently, we can then compute
the principal square root $\mathrm{PSqrt}_{p,g}(x)$ with a high probability, using a
majority decision on the values $A_1(p, g, x g^{2r}) \cdot g^{-r}$. The probability of success
increases linearly with the number of sampled points, and it can be computed
by Corollary B.18, which is a consequence of the weak law of large numbers.


Algorithm 7.9.
int B(int x)
1 C0 ← 0, C1 ← 0
2 {u, v} ← Sqrt(x)
3 for i ← 1 to t do
4   select r ∈ {0, ..., (p−1)/2 − 1} at random
5   if A1(p, g, x g^{2r}) = u g^r
6     then C0 ← C0 + 1
7     else C1 ← C1 + 1
8 if C0 > C1
9   then return u
10  else return v
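
The majority decision can be simulated on a small prime. The Python sketch below is an illustration under several assumptions: it uses $p \equiv 3 \bmod 4$ so that the square roots are $\pm x^{(p+1)/4}$, and `make_noisy_oracle` is a hypothetical stand-in for $A_1$ that answers correctly with probability $1 - \texttt{error\_rate}$ and otherwise returns the other square root. The parameters $p = 10007$, $g = 5$ and the number of votes are toy choices, not the value $t = k\varepsilon^{-2}$ from the lemma.

```python
import random


def amplified_psqrt(p, g, x, noisy_psqrt, t, rng):
    """Majority vote in the style of Algorithm 7.9, assuming p = 3 mod 4."""
    u = pow(x, (p + 1) // 4, p)                 # the two square roots of x
    v = p - u
    votes_for_u = 0
    for _ in range(t):
        r = rng.randrange((p - 1) // 2)         # r in {0, ..., (p-1)/2 - 1}
        shifted = x * pow(g, 2 * r, p) % p      # randomized input x * g^(2r)
        if noisy_psqrt(shifted) == u * pow(g, r, p) % p:
            votes_for_u += 1
    # C0 > C1 is the same as votes_for_u > t - votes_for_u.
    return u if 2 * votes_for_u > t else v


def make_noisy_oracle(p, g, error_rate, rng):
    """Hypothetical A1: the principal root, or the other root with prob. error_rate."""
    log = {pow(g, y, p): y for y in range(p - 1)}

    def oracle(z):
        root = pow(g, log[z] // 2, p)
        return root if rng.random() >= error_rate else p - root

    return oracle


if __name__ == "__main__":
    rng = random.Random(0)
    p, g = 10007, 5                             # toy parameters (assumption)
    oracle = make_noisy_oracle(p, g, 0.25, rng)
    x = pow(g, 6, p)                            # x = g^(2s) with small s = 3
    print(amplified_psqrt(p, g, x, oracle, 201, rng) == pow(g, 3, p))
```

The input is chosen with a small $s$ on purpose: only then does $s + r$ stay below $(p-1)/2$ for almost all $r$, which is the condition used in the proof.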


We now show that

$$\mathrm{prob}(B(x) = \mathrm{PSqrt}_{p,g}(x)) \ge 1 - \frac{1}{k}, \quad \text{for } x = g^{2s}, \; 0 \le s \le \frac{p-1}{2t}.$$

Let $r_i$ be the randomly selected elements $r$ and $x_i = x g^{2r_i}$, $i = 1, \ldots, t$. Every
element $z \in \mathrm{QR}_p$ has a unique representation $z = x g^{2r}$, $0 \le r < (p-1)/2$.
Therefore, the uniform and random choice of $r_i$ implies the uniform and
random choice of $x_i$. Let $x = g^{2s}$, $0 \le s \le (p-1)/2t$, and $r < (1 - 1/t)(p-1)/2$.
Then

$$2s + 2r < \frac{p-1}{t} + \left(1 - \frac{1}{t}\right)(p-1) = p - 1,$$

and hence

$$\mathrm{PSqrt}_{p,g}(x) \cdot g^r = \mathrm{PSqrt}_{p,g}(x g^{2r}).$$

Let $E_1$ be the event $r < (1 - 1/t)(p-1)/2$ and $E_2$ be the event $A_1(p, g, x g^{2r}) =
\mathrm{PSqrt}_{p,g}(x g^{2r})$. We have $\mathrm{prob}(E_1) \ge 1 - 1/t$ and $\mathrm{prob}(E_2) \ge 1/2 + \varepsilon$, and we
correctly compute $\mathrm{PSqrt}_{p,g}(x) \cdot g^r$, if both events $E_1$ and $E_2$ occur. Thus, we
have (denoting by $\overline{E_2}$ the complement of event $E_2$)


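$$\begin{aligned}
\mathrm{prob}(A_1(p, g, x g^{2r}) = \mathrm{PSqrt}_{p,g}(x) \cdot g^r)
&\ge \mathrm{prob}(E_1 \text{ and } E_2) \ge \mathrm{prob}(E_1) - \mathrm{prob}(\overline{E_2}) \\
&\ge \frac{1}{2} + \varepsilon - \frac{1}{t} \ge \frac{1}{2} + \frac{\varepsilon}{2}.
\end{aligned}$$
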

In each of the $t$ iterations of $B$, $\mathrm{PSqrt}_{p,g}(x) \cdot g^r$ is computed with
probability $> 1/2$. Taking the most frequent result, we get $\mathrm{PSqrt}_{p,g}(x)$ with a
very high probability. More precisely, we can apply Corollary B.18 to the
independent random variables $S_i$, $i = 1, \ldots, t$, defined by

$$S_i := \begin{cases} 1 & \text{if } A_1(p, g, x g^{2r_i}) = \mathrm{PSqrt}_{p,g}(x) \cdot g^{r_i},\\ 0 & \text{otherwise.} \end{cases}$$

We have $E(S_i) = \mathrm{prob}(S_i = 1) \ge 1/2 + \varepsilon/2$.
If $u = \mathrm{PSqrt}_{p,g}(x)$, then we conclude by Corollary B.18 that

$$\mathrm{prob}(B(x) = \mathrm{PSqrt}_{p,g}(x)) = \mathrm{prob}\left(C_0 > \frac{t}{2}\right) \ge 1 - \frac{1}{t\varepsilon^2} = 1 - \frac{1}{k}.$$

The case $v = \mathrm{PSqrt}_{p,g}(x)$ follows analogously. This completes the proof of
the lemma.


We continue with the proof of the theorem. The following algorithm
computes the discrete logarithm.


Algorithm 7.10.
int A(int p, g, x)
1 y ← empty word, k ← |p|, t ← kε^{-2}
2 guess j ∈ {0, ..., t − 1} satisfying j(p−1)/t ≤ Log_{p,g}(x) < (j+1)(p−1)/t
3 x ← x g^{-⌈j(p−1)/t⌉}
4 for c ← 1 to k do
5   if x ∈ QR_p
6     then y ← y||0
7     else y ← y||1
8          x ← x g^{-1}
9   x ← B(x)
10 return y + ⌈j(p−1)/t⌉

In algorithm $A$ we use “to guess” as a basic operation. To guess the right
alternative means to find out the right choice by computation.

Here, to guess the correct $j$ means to carry out lines 3-9 and then test
whether $y + \lceil j(p-1)/t \rceil$ is equal to $\mathrm{Log}_{p,g}(x)$, for $j = 0, 1, \ldots$. The test is done
by modular exponentiation. We stop if the test succeeds, i.e., if $\mathrm{Log}_{p,g}(x)$ is
computed. We have to consider at most $t = k\varepsilon^{-2} = kP^2(k)$ many intervals.
Hence, in this way we get a polynomial algorithm. This notion of “to guess”
will also be used in the subsequent sections.

We have $A(p, g, x) = \mathrm{Log}_{p,g}(x)$ if $B$ correctly computes $\mathrm{PSqrt}_{p,g}(x)$ in
each iteration of the for loop. Thus, we get

$$\mathrm{prob}(A(p, g, x) = \mathrm{Log}_{p,g}(x)) \ge \left(1 - \frac{1}{k}\right)^k.$$

As $(1 - 1/k)^k$ increases monotonously (it converges to $e^{-1}$), this implies

$$\mathrm{prob}(A(p, g, x) = \mathrm{Log}_{p,g}(x)) \ge \left(1 - \frac{1}{2}\right)^2 = \frac{1}{4}.$$

By Proposition 5.7, we get, by repeating the computation $A(p, g, x)$
independently, a probabilistic polynomial algorithm $A_2(p, g, x)$ with

$$\mathrm{prob}(A_2(p, g, x) = \mathrm{Log}_{p,g}(x)) > 1 - 2^{-Q(k)}.$$


This concludes the proof of the theorem.


Remarks: 


1. The expectation is that $A$ will compute $\mathrm{Log}_{p,g}(x)$ after four repetitions
   (see Lemma B.12). Thus, we expect that we have to call $A_1$ at most
   $4k^3\varepsilon^{-4}$ times to compute $\mathrm{Log}_{p,g}(x)$.

2. In [BluMic84], Blum and Micali introduced the idea to reduce the discrete 
logarithm problem to the problem of computing principal square roots. 
They developed the techniques we used to prove Theorem 7.7. In that 
paper they also constructed cryptographically strong pseudo-random bit 
generators using hard-core bits. They proved that the most-significant bit 
of the discrete logarithm is unpredictable and achieved as an application 
the discrete exponential generator (see Section 8.1). 


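Corollary 7.11. Let $I := \{(p,g) \mid p \text{ prime}, \; g \in \mathbb{Z}_p^* \text{ a primitive root}\}$.
Provided the discrete logarithm assumption is true,

$$\mathrm{Msb} := \left( \mathrm{Msb}_p : \mathbb{Z}_{p-1} \to \{0,1\}, \; x \mapsto \begin{cases} 0 & \text{if } 0 \le x < \frac{p-1}{2},\\ 1 & \text{if } \frac{p-1}{2} \le x < p - 1 \end{cases} \right)_{(p,g) \in I}$$

is a family of hard-core predicates for the Exp family

$$\mathrm{Exp} = (\mathrm{Exp}_{p,g} : \mathbb{Z}_{p-1} \to \mathbb{Z}_p^*, \; x \mapsto g^x \bmod p)_{(p,g) \in I}.$$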


Proof. Assume Msb is not a family of hard-core predicates for Exp. Then,
there is a positive polynomial $P \in \mathbb{Z}[X]$ and an algorithm $A_1$, such that for
infinitely many $k$

$$\mathrm{prob}(A_1(p, g, g^x) = \mathrm{Msb}_p(x) : (p,g) \leftarrow I_k, x \stackrel{u}{\leftarrow} \mathbb{Z}_{p-1}) > \frac{1}{2} + \frac{1}{P(k)}.$$

By Proposition 6.17, there are positive polynomials $Q, R$, such that

$$\mathrm{prob}\left(\left\{ (p,g) \in I_k \;\middle|\; \mathrm{prob}(A_1(p, g, g^x) = \mathrm{Msb}_p(x) : x \stackrel{u}{\leftarrow} \mathbb{Z}_{p-1}) > \frac{1}{2} + \frac{1}{Q(k)} \right\}\right) > \frac{1}{R(k)}$$

for infinitely many $k$. From the theorem and the remark after Definition 7.2
above, we conclude that there is an algorithm $A_2$ and a positive polynomial
$S$, such that

$$\mathrm{prob}\left(\left\{ (p,g) \in I_k \;\middle|\; \mathrm{prob}(A_2(p, g, g^x) = x : x \stackrel{u}{\leftarrow} \mathbb{Z}_{p-1}) \ge 1 - \frac{1}{S(k)} \right\}\right) > \frac{1}{R(k)}$$

for infinitely many $k$. By Proposition 6.3, there is a positive polynomial $T$
such that

$$\mathrm{prob}(A_2(p, g, g^x) = x : (p,g) \leftarrow I_k, x \stackrel{u}{\leftarrow} \mathbb{Z}_{p-1}) > \frac{1}{T(k)}$$

for infinitely many $k$, a contradiction to the discrete logarithm assumption
(Definition 6.1).


Remark. Suppose that $p - 1 = 2^t a$, where $a$ is odd. The $t$ least-significant
bits of $x$ can be easily computed from $g^x = \mathrm{Exp}_{p,g}(x)$ (see Exercise 3 of
this chapter). But all the other bits of $x$ are secure bits, i.e., each of them
yields a hard-core predicate, as shown by Håstad and Näslund ([HasNas98];
[HasNas99]).

Let $\mathrm{Lsb}(x) := x \bmod 2$ be the least-significant bit of $x \in \mathbb{Z}, x \ge 0$, with
respect to the binary encoding of $x$ as an unsigned number. In order to
compute $\mathrm{Lsb}(x)$ for an element $x$ in $\mathbb{Z}_n$, we apply Lsb to the representative
of $x$ between 0 and $n - 1$.


In this section, we study the RSA function

$$\mathrm{RSA}_{n,e} : \mathbb{Z}_n^* \to \mathbb{Z}_n^*, \; x \mapsto x^e,$$

and its inverse $\mathrm{RSA}_{n,e}^{-1}$, for $n = pq$, with $p$ and $q$ odd, distinct primes, and $e$
prime to $\varphi(n)$.

To compute $\mathrm{Lsb}(x)$ from $y = x^e$ is as difficult as to compute $x$ from $y$.
The following Proposition 7.12 and Theorem 7.14 make this statement more
precise. The proofs show how to reduce the computation of $x$ from $y$ to the
computation of $\mathrm{Lsb}(x)$ from $y$.

First, we study the deterministic case where $\mathrm{Lsb}(x)$ can be computed from
$y$ by a deterministic algorithm in polynomial time.

Proposition 7.12. Let $A_1$ be a deterministic polynomial algorithm, such that

$$A_1(n, e, x^e) = \mathrm{Lsb}(x) \text{ for all } x \in \mathbb{Z}_n^*,$$

where $n := pq$, $p$ and $q$ are odd distinct prime numbers, and $e$ is relatively
prime to $\varphi(n)$. Then there is a deterministic polynomial algorithm $A_2$, such
that

$$A_2(n, e, x^e) = x \text{ for all } x \in \mathbb{Z}_n^*.$$


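Proof. Let $x \in \mathbb{Z}_n^*$ and $y = x^e$. The basic idea of the inversion algorithm is
to compute $a \in \mathbb{Z}_n^*$ and a rational number $u \in \mathbb{Q}, 0 \le u < 1$, with

$$|ax \bmod n - un| < \frac{1}{2}.$$

Then we have $ax \bmod n = \lfloor un + 1/2 \rfloor$, and hence $x = a^{-1} \lfloor un + 1/2 \rfloor \bmod n$.
This method to invert the RSA function is called rational approximation. We
approximate $ax \bmod n$ by the rational number $un$.

For $z \in \mathbb{Z}$, let $\overline{z} := z \bmod n$. Let $2^{-1}$ denote the inverse element of $2 \bmod n$
in $\mathbb{Z}_n^*$. We start with $u_0 = 0$ and $a_0 = 1$ to get an approximation for $\overline{a_0 x}$ with

$$|\overline{a_0 x} - u_0 n| < n.$$

We define

$$a_t := 2^{-1} a_{t-1} \quad \text{and} \quad u_t := \frac{1}{2}\left(u_{t-1} + \mathrm{Lsb}(\overline{a_{t-1} x})\right)$$

(the last computation is done in $\mathbb{Q}$). In each step, we replace $a_{t-1}$ by $a_t$ and
$u_{t-1}$ by $u_t$, and we observe that

$$\overline{a_t x} = \overline{2^{-1} a_{t-1} x} = \begin{cases} \frac{1}{2}\, \overline{a_{t-1} x} & \text{if } \overline{a_{t-1} x} \text{ is even},\\ \frac{1}{2} (\overline{a_{t-1} x} + n) & \text{if } \overline{a_{t-1} x} \text{ is odd}, \end{cases}$$

and hence

$$|\overline{a_t x} - u_t n| = \frac{1}{2}\, |\overline{a_{t-1} x} - u_{t-1} n|.$$

After $r = |n| + 1$ steps, we reach

$$|\overline{a_r x} - u_r n| < \frac{n}{2^r} < \frac{1}{2}.$$

Since $\mathrm{Lsb}(\overline{a_t x}) = A_1(n, e, a_t^e y \bmod n)$, we can decide whether $\overline{a_t x}$ is even
without knowing $x$. Thus, we can compute $a_t$ and $u_t$ in each step, and finally
get $x$. The following algorithm inverts the RSA function.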


Algorithm 7.13.
int A2(int n, e, y)
1 a ← 1, u ← 0, k ← |n|
2 for t ← 0 to k do
3   u ← 1/2 (u + A1(n, e, a^e y mod n))
4   a ← 2^{-1} a mod n
5 return a^{-1} ⌊un + 1/2⌋ mod n


This completes the proof of the Proposition.
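
Rational approximation is easy to try out with exact fractions. The Python sketch below follows the structure of Algorithm 7.13; the least-significant-bit oracle is simulated with the private exponent, which is an assumption made only to obtain a working $A_1$ for the demonstration. The toy key $n = 3233 = 61 \cdot 53$, $e = 17$, $d = 2753$ is likewise an example choice. `pow(a, -1, n)` requires Python 3.8 or later.

```python
from fractions import Fraction
from math import floor


def invert_rsa_with_lsb_oracle(n, e, y, lsb_oracle):
    """Recover x from y = x^e mod n, given an oracle for Lsb of the RSA preimage."""
    a, u = 1, Fraction(0)
    half = pow(2, -1, n)                    # 2^(-1) mod n, exists because n is odd
    for _ in range(n.bit_length() + 1):     # t = 0, ..., k with k = |n|
        # a^e * y mod n is the RSA image of a*x mod n.
        u = (u + lsb_oracle(pow(a, e, n) * y % n)) / 2
        a = a * half % n
    ax = floor(u * n + Fraction(1, 2))      # ax mod n is the integer nearest to u*n
    return pow(a, -1, n) * ax % n


if __name__ == "__main__":
    n, e, d = 3233, 17, 2753                # toy RSA key (assumption)
    oracle = lambda c: pow(c, d, n) & 1     # simulated A1, uses the private key
    x = 1234
    print(invert_rsa_with_lsb_oracle(n, e, pow(x, e, n), oracle) == x)
```

Using `Fraction` keeps $u$ exact, which mirrors the remark that the computation of $u_t$ is done in $\mathbb{Q}$; floating-point arithmetic would lose the low-order bits for realistic modulus sizes.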


Next we study the probabilistic case. Now the algorithm $A_1$ does not
compute the predicate $\mathrm{Lsb}(x)$ deterministically, but only with a probability
slightly better than guessing it at random. Nevertheless, RSA can be inverted
with a high probability.


Theorem 7.14. Let $P, Q \in \mathbb{Z}[X]$ be positive polynomials and $A_1$ be a
probabilistic polynomial algorithm, such that

$$\mathrm{prob}(A_1(n, e, x^e) = \mathrm{Lsb}(x) : x \stackrel{u}{\leftarrow} \mathbb{Z}_n^*) \ge \frac{1}{2} + \frac{1}{P(k)},$$

where $n := pq$, $k := |n|$, $p$ and $q$ are odd distinct primes and $e$ is relatively
prime to $\varphi(n)$. Then there is a probabilistic polynomial algorithm $A_2$, such
that

$$\mathrm{prob}(A_2(n, e, x^e) = x) \ge 1 - 2^{-Q(k)} \text{ for all } x \in \mathbb{Z}_n^*.$$


Proof. Let $y := x^e$ and let $\varepsilon := 1/P(k)$. As in the deterministic case, we use
rational approximation to invert the RSA function. We try to approximate
$ax \bmod n$ by a rational number $un$. To invert RSA correctly, we have to
compute $\mathrm{Lsb}(ax)$ correctly in each step. However, now we only know that
the probability for $k$ correct computations of Lsb is $\ge (1/2 + \varepsilon)^k$, which is
exponentially close to 0 and thus too small. In order to increase the
probability of success, we develop the algorithm $L$. The probability of success of $L$
is sufficiently high. This is the statement of the following lemma.


Lemma 7.15. Under the assumptions of the theorem, there is a probabilistic
polynomial algorithm $L$ with the following properties: given $y := x^e$, randomly
chosen $a, b \in \mathbb{Z}_n^*$,² $\alpha := \mathrm{Lsb}(ax \bmod n)$, $\beta := \mathrm{Lsb}(bx \bmod n)$, $u \in \mathbb{Q}$ with
$|ax \bmod n - un| \le \varepsilon^3 n/8$ and $v \in \mathbb{Q}$ with $|bx \bmod n - vn| \le \varepsilon n/8$, then $L$
successively computes values $l_t, t = 0, 1, 2, \ldots, k$, such that

$$\mathrm{prob}\left(l_t = \mathrm{Lsb}(a_t x \bmod n) \;\middle|\; \bigwedge_{j=0}^{t-1} l_j = \mathrm{Lsb}(a_j x \bmod n) : a, b \stackrel{u}{\leftarrow} \mathbb{Z}_n^*\right) \ge 1 - \frac{1}{2k},$$

where $a_0 := a$, $a_t := 2^{-1} a_{t-1}$.


² Actually we randomly choose $a, b \in \mathbb{Z}_n$. In the rare case that $a, b \notin \mathbb{Z}_n^*$, we can
factor $n$ using Euclid’s algorithm and compute $x$ from $x^e$.


Proof (of the Lemma). Let $m := \min(2^t \varepsilon^{-2}, 2k\varepsilon^{-2})$. We may assume that
both primes $p$ and $q$ are $> m$. Namely, if one of the primes is $\le m$, then
we may factorize $n$ in polynomial time (and then easily compute the inverse
of the RSA function), simply by checking whether one of the polynomially
many numbers $\le m$ divides $n$.

To compute $\mathrm{Lsb}(a_t x \bmod n)$, $L$ will apply $A_1$ $m$ times. In each step, a
least-significant bit is computed and the return value of $L$ is the more
frequently occurring bit. In the theorem, we assume that

$$\mathrm{prob}(A_1(n, e, x^e) = \mathrm{Lsb}(x) : x \stackrel{u}{\leftarrow} \mathbb{Z}_n^*) \ge \frac{1}{2} + \varepsilon.$$

The probability is also taken over the random choice of $x \in \mathbb{Z}_n^*$. Thus, we
cannot simply repeat the execution of $A_1$ with the same input $(a_t x)^e$, but
have to modify the input randomly. We use the modifiers $a_t + i a_{t-1} + b$,
$i \in \mathbb{Z}, -m/2 \le i \le m/2 - 1$, and compute $\mathrm{Lsb}((a_t + i a_{t-1} + b)x \bmod n)$.³ As
we will see below, the assumptions of the lemma guarantee that we can infer
$\mathrm{Lsb}(a_t x \bmod n)$ from $\mathrm{Lsb}((a_t + i a_{t-1} + b)x \bmod n)$ with high probability (here
sufficiently good rational approximations $u_t$ of $a_t x \bmod n$ and $v$ of $bx \bmod n$
are needed). $a$ and $b$ are chosen independently and at random, because then
the modifiers $a_t + i a_{t-1} + b$ are pairwise independent. Then Corollary B.18,
which is a consequence of the weak law of large numbers, applies. This implies
that the probability of success of $L$ is as large as desired.

We now describe how $L$ works on input of $y = x^e$, $a$, $b$, $\alpha$, $\beta$, $u$ and $v$ to
compute $l_t, t = 0, 1, \ldots, k$. In its computation, $L$ uses the variable $a_t$ to store the
current $a_t$, and the variable $a_{t-1}$ to store the $a_t$ from the preceding iteration.
Analogously, we use variables $u_t$ and $u_{t-1}$. $u_t n$ is a rational approximation
of $a_t x$. We have $u_0 = u$ and $u_t = 1/2\,(u_{t-1} + l_{t-1})$.

It is the goal of $L$ to return $l_t = \mathrm{Lsb}(a_t x \bmod n)$ for $t = 0, 1, \ldots, k$. The
first iteration $t = 0$ is easy: $l_0$ is the given $\alpha$. From now on, the variable $\alpha$ is
used to store the last computed $l_t$.

Before $L$ starts to compute $l_1, l_2, \ldots, l_k$, its variables are initialized by

$$a_{t-1} := a_0 = a, \quad u_{t-1} := u.$$

To compute $l_t, t \ge 1$, $L$ repeats the following subroutine.


³ If $a_t + i a_{t-1} + b \notin \mathbb{Z}_n^*$, we factor $n$ using Euclid’s algorithm and compute $x$ from
$x^e$.


Algorithm 7.16.
L′( )
1 C0 ← 0; C1 ← 0
2 a_t ← 2^{-1} a_{t−1}; u_t ← 1/2 (u_{t−1} + α)
3 for i ← −m/2 to m/2 − 1 do
4   A ← a_t + i a_{t−1} + b
5   W ← ⌊u_t + i u_{t−1} + v⌋
6   B ← (iα + β + W) mod 2
7   if A1(n, e, A^e y mod n) ⊕ B = 0
8     then C0 ← C0 + 1
9     else C1 ← C1 + 1
10 u_{t−1} ← u_t, a_{t−1} ← a_t
11 if C0 > C1
12   then α ← 0
13   else α ← 1
14 return α


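For $z \in \mathbb{Z}$, we denote by $\overline{z}$ the remainder of $z$ modulo $n$.
For $i \in \{-m/2, \ldots, m/2 - 1\}$, let

$$\begin{aligned}
A_{t,i} &:= a_t + i a_{t-1} + b, \\
W'_{t,i} &:= u_t + i u_{t-1} + v, \quad W_{t,i} := \lfloor W'_{t,i} \rfloor, \\
B_{t,i} &:= (i \cdot \mathrm{Lsb}(\overline{a_{t-1} x}) + \mathrm{Lsb}(\overline{bx}) + \mathrm{Lsb}(W_{t,i})) \bmod 2.
\end{aligned}$$

We want to compute $\mathrm{Lsb}(\overline{a_t x})$ from $\mathrm{Lsb}(\overline{A_{t,i} x})$, $\mathrm{Lsb}(\overline{a_{t-1} x})$ and $\mathrm{Lsb}(\overline{bx})$.

For this purpose, let $\lambda_{t,i} := \overline{a_t x} + i \cdot \overline{a_{t-1} x} + \overline{bx} = qn + \overline{A_{t,i} x}$ with $q = \lfloor \lambda_{t,i}/n \rfloor$.
Then $\mathrm{Lsb}(\lambda_{t,i}) = (\mathrm{Lsb}(\overline{a_t x}) + i \cdot \mathrm{Lsb}(\overline{a_{t-1} x}) + \mathrm{Lsb}(\overline{bx})) \bmod 2$ and

$$\begin{aligned}
\mathrm{Lsb}(\overline{A_{t,i} x}) &= (\mathrm{Lsb}(\lambda_{t,i}) + \mathrm{Lsb}(q)) \bmod 2 \\
&= (\mathrm{Lsb}(\overline{a_t x}) + i \cdot \mathrm{Lsb}(\overline{a_{t-1} x}) + \mathrm{Lsb}(\overline{bx}) + \mathrm{Lsb}(q)) \bmod 2,
\end{aligned}$$

and we obtain

$$\mathrm{Lsb}(\overline{a_t x}) = (\mathrm{Lsb}(\overline{A_{t,i} x}) + i \cdot \mathrm{Lsb}(\overline{a_{t-1} x}) + \mathrm{Lsb}(\overline{bx}) + \mathrm{Lsb}(q)) \bmod 2.$$

The problem is to get $q$ and $\mathrm{Lsb}(q)$. We will show that $W_{t,i}$ is equal to
$q$ with a high probability, and $W_{t,i}$ is easily computed from the rational
approximations $u_t$ of $\overline{a_t x}$, $u_{t-1}$ of $\overline{a_{t-1} x}$ and $v$ of $\overline{bx}$. If $W_{t,i} = q$, we have

$$\mathrm{Lsb}(\overline{a_t x}) = \mathrm{Lsb}(\overline{A_{t,i} x}) \oplus B_{t,i}.$$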
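
The parity identity above holds for the exact quotient $q$ because $n$ is odd, and it can be checked numerically. The Python sketch below does this with the true $q$, not with the approximation $W_{t,i}$; the function name and the toy modulus $n = 3233$ are assumptions made for this check.

```python
def lsb(z):
    """Least-significant bit; also correct for negative integers in Python."""
    return z & 1


def parity_identity_holds(n, x, a_prev, b, i):
    """Check Lsb(a_t x) = (Lsb(A x) + i*Lsb(a_{t-1} x) + Lsb(b x) + Lsb(q)) mod 2."""
    a_t = pow(2, -1, n) * a_prev % n
    A = (a_t + i * a_prev + b) % n
    # lambda_{t,i}, built from the reduced representatives as in the text.
    lam = a_t * x % n + i * (a_prev * x % n) + b * x % n
    q = lam // n                                # floor division, also for negative lam
    rhs = (lsb(A * x % n) + i * lsb(a_prev * x % n) + lsb(b * x % n) + lsb(q)) % 2
    return lsb(a_t * x % n) == rhs


if __name__ == "__main__":
    n = 3233                                    # toy odd modulus 61 * 53 (assumption)
    print(all(parity_identity_holds(n, 1234, 77, 999, i) for i in range(-8, 8)))
```

In the algorithm itself $q$ is unknown; the remainder of the proof bounds the probability that the computable value $W_{t,i}$ coincides with it.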


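We assume from now on that $L$ computed the least-significant bit correctly
in the preceding steps:

$$\mathrm{Lsb}(\overline{a_j x}) = l_j, \quad 0 \le j \le t - 1.$$

Next, we give a lower bound for the probability that $W_{t,i} = q$. Let $Z :=
|\lambda_{t,i} - W'_{t,i} n|$. Then

$$\begin{aligned}
Z &= |\overline{a_t x} - u_t n + i(\overline{a_{t-1} x} - u_{t-1} n) + \overline{bx} - vn| \\
&\le \frac{1}{2}\, |\overline{a_{t-1} x} - u_{t-1} n| \cdot |1 + 2i| + |\overline{bx} - vn| \\
&\le \frac{n}{2^t}\, |1 + 2i|\, \frac{\varepsilon^3}{8} + \frac{\varepsilon}{8}\, n \le \frac{\varepsilon}{8}\, n\, (1 + 1) = \frac{\varepsilon}{4}\, n.
\end{aligned}$$

Note that $|\overline{a_j x} - u_j n| = 1/2\, |\overline{a_{j-1} x} - u_{j-1} n|$ for $1 \le j \le t$ under our
assumption $l_j = \mathrm{Lsb}(\overline{a_j x}), 0 \le j \le t - 1$ (see the proof of Proposition
7.12). Moreover, $|1 + 2i| \le m$, because $-m/2 \le i \le m/2 - 1$, and $m =
\min(2^t \varepsilon^{-2}, 2k\varepsilon^{-2})$.

Now $W_{t,i} \ne q$ if and only if there is a multiple of $n$ between $\lambda_{t,i}$ and $W'_{t,i} n$.
There is no multiple of $n$ between $\lambda_{t,i}$ and $W'_{t,i} n$ if

$$\frac{\varepsilon}{4}\, n < \overline{A_{t,i} x} = \lambda_{t,i} \bmod n < n - \frac{\varepsilon}{4}\, n,$$

because $Z \le \varepsilon/4 \cdot n$.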
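Since $a, b \in \mathbb{Z}_n^*$ are selected uniformly and at random, the remainders
$\overline{A_{t,i} x} = (a_t + i a_{t-1} + b)x \bmod n = ((2^{-1} + i) a_{t-1} + b)x \bmod n$ are also selected
uniformly and at random.

This implies

$$\mathrm{prob}(W_{t,i} = q) \ge \mathrm{prob}\left(\frac{\varepsilon}{4}\, n < \overline{A_{t,i} x} < n - \frac{\varepsilon}{4}\, n\right) \ge 1 - \frac{\varepsilon}{2}.$$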

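We are now ready to show

\[
\mathrm{prob}\Big( l_t = \mathrm{Lsb}(\overline{a_t x}) \,\Big|\, \bigwedge_{j=0}^{t-1} l_j = \mathrm{Lsb}(\overline{a_j x}) \Big) \ge 1 - \frac{1}{2k}.
\]

Let $E_{1,i}$ be the event $A_1(n, e, A_{t,i}^e\, y) = \mathrm{Lsb}(\overline{A_{t,i} x})$ (here recall that $y = x^e$), and let $E_{2,i}$ be the event that $\overline{A_{t,i} x}$ satisfies the condition

\[
\frac{\varepsilon}{4}\,n < \overline{A_{t,i} x} < n - \frac{\varepsilon}{4}\,n.
\]

We have $\mathrm{prob}(E_{1,i}) \ge 1/2 + \varepsilon$ and $\mathrm{prob}(E_{2,i}) \ge 1 - \varepsilon/2$. We define random variables

\[
S_i := \begin{cases} 1 & \text{if } E_{1,i} \text{ and } E_{2,i} \text{ occur}, \\ 0 & \text{otherwise.} \end{cases}
\]

We do not err in computing $\mathrm{Lsb}(\overline{a_t x})$ in the $t$-th step of our algorithm $L$ if both events $E_{1,i}$ and $E_{2,i}$ occur, i.e., if $S_i = 1$.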
We have (denoting by $\overline{E_{1,i}}$ the complement of event $E_{1,i}$)

\[
\mathrm{prob}(S_i = 1) = \mathrm{prob}(E_{1,i} \text{ and } E_{2,i}) \ge \mathrm{prob}(E_{2,i}) - \mathrm{prob}(\overline{E_{1,i}}) \ge \left(1 - \frac{\varepsilon}{2}\right) - \left(\frac{1}{2} - \varepsilon\right) = \frac{1}{2} + \frac{\varepsilon}{2}.
\]

Let $i \ne j$. The probabilities $\mathrm{prob}(S_i = d)$ and $\mathrm{prob}(S_j = d)$ ($d \in \{0,1\}$) are taken over the random choice of $a, b \in \mathbb{Z}_n^*$, and the coin tosses of $A_1(n, e, A_{t,i}^e\, y)$ and $A_1(n, e, A_{t,j}^e\, y)$. The elements $a_0 = a$ and $b$ are chosen independently (and uniformly), and we have $(A_{t,i}, A_{t,j}) = (a_{t-1}, b)\,\Delta = (2^{-t+1} a, b)\,\Delta$ with the invertible matrix

\[
\Delta = \begin{pmatrix} 2^{-1} + i & 2^{-1} + j \\ 1 & 1 \end{pmatrix}
\]

over $\mathbb{Z}_n$. Thus, $A_{t,i}$ and $A_{t,j}$ are also independent. This implies that the events $E_{2,i}$ and $E_{2,j}$ are independent and that the inputs $A_{t,i}^e\, y$ and $A_{t,j}^e\, y$ are independent random elements. Since the coin tosses during an execution of $A_1$ are independent of all other random events (see Chapter 5), the events $E_{1,i}$ and $E_{1,j}$ are also independent. We see that $S_i$ and $S_j$ are indeed independent. Note that the determinant $i - j$ of $\Delta$ is in $\mathbb{Z}_n^*$, since $|i - j| < m < \min\{p, q\}$.

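The number of $i$, $-m/2 \le i \le m/2 - 1$, and hence the number of random variables $S_i$, is $m$. By Corollary B.18, we conclude

\[
\mathrm{prob}\Big( \sum_i S_i > \frac{m}{2} \Big) \ge 1 - \frac{1}{2k}.
\]

Recall that we do not err in computing $\mathrm{Lsb}(\overline{a_t x})$ in the $t$-th step of our algorithm $L$, if both events $E_{1,i}$ and $E_{2,i}$ occur, i.e., if $S_i = 1$. Thus, we have $C_0 \ge \sum_i S_i$ and hence $\mathrm{prob}(C_0 > C_1) \ge 1 - 1/(2k)$ if $\mathrm{Lsb}(\overline{a_t x}) = 0$, and $C_1 \ge \sum_i S_i$ and hence $\mathrm{prob}(C_1 > C_0) \ge 1 - 1/(2k)$ if $\mathrm{Lsb}(\overline{a_t x}) = 1$. Therefore, we have shown that

\[
\mathrm{prob}\Big( l_t = \mathrm{Lsb}(\overline{a_t x}) \,\Big|\, \bigwedge_{j=0}^{t-1} l_j = \mathrm{Lsb}(\overline{a_j x}) \Big) \ge 1 - \frac{1}{2k}.
\]

The proof of the lemma is complete.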


We continue in the proof of the theorem. The following algorithm $A$ inverts the RSA function by the method of rational approximation. The basic structure of $A$ is the same as that of Algorithm 7.13. Now we call $L$ to compute $\mathrm{Lsb}(\overline{a_t x})$. Therefore, we must meet the assumptions of Lemma 7.15. This is done in lines 1-4.

Algorithm 7.17.
int A(int n, e, y)
 1  select $a, b \in \mathbb{Z}_n^*$ at random
 2  guess $u, v \in \mathbb{Q} \cap [0, 1[$ satisfying
 3      $|ax \bmod n - un| \le \varepsilon^3 n/8$, $|bx \bmod n - vn| \le \varepsilon n/8$
 4  guess $\alpha \leftarrow \mathrm{Lsb}(ax \bmod n)$, guess $\beta \leftarrow \mathrm{Lsb}(bx \bmod n)$
 5  compute $l_0, l_1, \ldots, l_k$ by $L$
 6  for $t \leftarrow 0$ to $k$ do
 7      $u \leftarrow \frac{1}{2}(u + l_t)$
 8      $a \leftarrow 2^{-1} a \bmod n$
 9  return $a^{-1} \lfloor un + \frac{1}{2} \rfloor \bmod n$


Algorithm $L$ computes $l_0, \ldots, l_k$ in advance. Lines 7 and 8 of $A$ also appear in $L$. In a real and efficient implementation, it is possible to avoid this redundancy.

As above in Algorithm 7.10, we can “guess” the right alternative. This 
means we can find out the right alternative in polynomial time. There are 
only a polynomial number of alternatives, and both the computation for each 
alternative as well as checking the result can be done in polynomial time. In 
order to guess $u$ or $v$, we have to consider $8\varepsilon^{-3} = 8P(k)^3$ and $8\varepsilon^{-1} = 8P(k)$ many intervals. There are only two alternatives for $\mathrm{Lsb}(\overline{ax})$ and $\mathrm{Lsb}(\overline{bx})$.

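$A(n, e, y) = \mathrm{RSA}_{n,e}^{-1}(y) = x$ for $y = x^e$ if $L$ correctly computes $l_t = \mathrm{Lsb}(\overline{a_t x})$ for $t = 1, \ldots, k$. Thus, we have

\[
\mathrm{prob}(A(n, e, y) = x) \ge \left(1 - \frac{1}{2k}\right)^k.
\]

Since $(1 - 1/(2k))^k$ increases monotonically in $k$ (converging to $e^{-1/2}$) and equals $1/2$ for $k = 1$, we conclude

\[
\mathrm{prob}(A(n, e, y) = x) \ge \frac{1}{2}.
\]

Repeating the computation $A(n, e, y)$ independently, we get, by Proposition 5.7, a probabilistic polynomial algorithm $A_2(n, e, y)$ with

\[
\mathrm{prob}(A_2(n, e, y) = \mathrm{RSA}_{n,e}^{-1}(y)) \ge 1 - 2^{-Q(k)},
\]

and Theorem 7.14 is proven.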
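The bound $(1 - 1/(2k))^k \ge 1/2$ used in the last step can be checked numerically. The following Python sketch is only an illustration; the helper name `success_lower_bound` is ours and does not appear in the text. It evaluates the expression for a few values of $k$ and compares it with the limit $e^{-1/2}$.

```python
import math

def success_lower_bound(k: int) -> float:
    """Lower bound (1 - 1/(2k))^k for the probability that L gets all k bits right."""
    return (1 - 1 / (2 * k)) ** k

sample_ks = (1, 2, 8, 64, 1024)
values = [success_lower_bound(k) for k in sample_ks]

for k, v in zip(sample_ks, values):
    # The values grow with k and stay below the limit exp(-1/2).
    print(k, round(v, 6), "limit:", round(math.exp(-0.5), 6))
```

The smallest value occurs at $k = 1$, which is why $1/2$ is a valid lower bound for every $k$.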
Remarks:

1. The expectation is that $A$ will compute $\mathrm{RSA}_{n,e}^{-1}(y)$ after two repetitions (see Lemma B.12). The input of $A_1$ does not depend on the guessed elements $u$, $v$, $\alpha$ and $\beta$ (it only depends on $a$ and $b$). Thus, we can also use the return values of $A_1$, computed for the first guess of $u$, $v$, $\alpha$ and $\beta$, for all subsequent guesses. Then we expect that we have to call $A_1$ at most $4k^2\varepsilon^{-2}$ times to compute $\mathrm{RSA}_{n,e}^{-1}(y)$.

2. The bit security of the RSA family was first studied in [GolMicTon82], in which a method for inverting RSA by guessing the least-significant bit was introduced (see Exercise 11). The problem of inverting RSA, if the least-significant bit is predicted only with probability $\ge 1/2 + 1/P(k)$, is studied in [SchnAle84], [VazVaz84], [AleChoGolSch88] and [FisSch2000]. The technique we used to prove Theorem 7.14 is from [FisSch2000].


Corollary 7.18. Let $I := \{(n, e) \mid n = pq,\ p \text{ and } q \text{ odd distinct primes},\ |p| = |q|,\ e \text{ prime to } \varphi(n)\}$. Provided the RSA assumption is true, then

\[
\mathrm{Lsb} = (\mathrm{Lsb}_{n,e} : \mathbb{Z}_n^* \to \{0,1\},\ x \mapsto \mathrm{Lsb}(x))_{(n,e) \in I}
\]

is a family of hard-core predicates for the RSA family

\[
\mathrm{RSA} = (\mathrm{RSA}_{n,e} : \mathbb{Z}_n^* \to \mathbb{Z}_n^*,\ x \mapsto x^e)_{(n,e) \in I}.
\]

Proof. The proof is analogous to the proof of Corollary 7.11.


Remark. Håstad and Näslund have shown that all the plaintext bits are secure bits of the RSA function, i.e., each of them yields a hard-core predicate ([HasNas98]; [HasNas99]).


7.3 Bit Security of the Square Family 


Let $n := pq$, with $p$ and $q$ distinct primes, and $p, q \equiv 3 \bmod 4$. We consider the (bijective) modular squaring function

\[
\mathrm{Square}_n : \mathrm{QR}_n \to \mathrm{QR}_n,\ x \mapsto x^2
\]

and its inverse, the modular square root function

\[
\mathrm{Sqrt}_n : \mathrm{QR}_n \to \mathrm{QR}_n,\ y \mapsto \mathrm{Sqrt}_n(y)
\]

(see Section 6.5). The computation of $\mathrm{Sqrt}_n(y)$ can be reduced to the computation of the least-significant bit $\mathrm{Lsb}(\mathrm{Sqrt}_n(y))$ of the square root. This is shown in Proposition 7.19 and Theorem 7.22. In the proposition, the algorithm that computes the least-significant bit is assumed to be deterministic polynomial. Then the algorithm which we obtain by the reduction is also deterministic polynomial. It computes $\mathrm{Sqrt}_n(y)$ for all $y \in \mathrm{QR}_n$ (under the additional assumption $n \equiv 1 \bmod 8$).


Proposition 7.19. Let $A_1$ be a deterministic polynomial algorithm, such that

\[
A_1(n, x^2) = \mathrm{Lsb}(x) \quad \text{for all } x \in \mathrm{QR}_n,
\]

where $n = pq$, $p$ and $q$ are distinct primes, $p, q \equiv 3 \bmod 4$ and $n \equiv 1 \bmod 8$. Then there exists a deterministic polynomial algorithm $A_2$, such that

\[
A_2(n, y) = \mathrm{Sqrt}_n(y) \quad \text{for all } y \in \mathrm{QR}_n.
\]

Proof. As in the RSA case, we use rational approximation to invert the Square function. Let $y = x^2$, $x \in \mathrm{QR}_n$. Since $n$ is assumed to be $\equiv 1 \bmod 8$, either $2 \in \mathrm{QR}_n$ or $-2 \in \mathrm{QR}_n$ (see the remark following this proof).

First, let $2 \in \mathrm{QR}_n$. Then $2^{-1} \in \mathrm{QR}_n$. We define $a_0 := 1$, $a_t := 2^{-1} a_{t-1} \bmod n$ for $t \ge 1$, and $u_0 := 1$, $u_t := \frac{1}{2}(u_{t-1} + \mathrm{Lsb}(a_{t-1} x \bmod n))$ for $t \ge 1$. Since $2^{-1} \in \mathrm{QR}_n$, we have $a_t \in \mathrm{QR}_n$ and hence $a_t x \in \mathrm{QR}_n$ for all $t \ge 1$. Thus, $\mathrm{Sqrt}_n(a_t^2 x^2) = a_t x$, and hence we can compute $\mathrm{Lsb}(a_t x \bmod n) = A_1(n, a_t^2 x^2) = A_1(n, a_t^2 y)$ by $A_1$ for all $t \ge 1$. The rational approximation works as in the RSA case.

Now, let $-2 \in \mathrm{QR}_n$. Then $-2^{-1} \in \mathrm{QR}_n$. We modify the method of rational approximation and define $a_0 := 1$, $a_t := -2^{-1} a_{t-1} \bmod n$ for $t \ge 1$, and $u_0 := 1$, $u_t := \frac{1}{2}(2 - \mathrm{Lsb}(a_{t-1} x \bmod n) - u_{t-1})$ for $t \ge 1$. Then, we get

\[
|a_t x \bmod n - u_t n| = \frac{1}{2}\,|a_{t-1} x \bmod n - u_{t-1} n|,
\]

because

\[
a_t x \bmod n = -2^{-1} a_{t-1} x \bmod n = 2^{-1}(n - a_{t-1} x \bmod n)
= \begin{cases}
\frac{1}{2}(n - a_{t-1} x \bmod n) & \text{if } a_{t-1} x \bmod n \text{ is odd}, \\
\frac{1}{2}(n - a_{t-1} x \bmod n + n) & \text{otherwise.}
\end{cases}
\]

After $r = |n| + 1$ steps we reach

\[
|a_r x \bmod n - u_r n| \le \frac{n}{2^r} < \frac{1}{2}.
\]

Since $-2^{-1} \in \mathrm{QR}_n$, we have $a_t \in \mathrm{QR}_n$ and hence $a_t x \in \mathrm{QR}_n$ for $t \ge 1$. Thus, $\mathrm{Sqrt}_n(a_t^2 x^2) = a_t x$, and hence we can compute $\mathrm{Lsb}(a_t x \bmod n) = A_1(n, a_t^2 x^2) = A_1(n, a_t^2 y)$ by $A_1$ for all $t \ge 1$.
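The first case of the proof ($2 \in \mathrm{QR}_n$) can be traced on a toy modulus. The following Python sketch is an illustration under stated assumptions, not the book's algorithm listing: the names `make_lsb_oracle` and `invert_square` are ours, the modulus $n = 7 \cdot 23$ is far too small for any real use, and the oracle $A_1$ is simulated by brute force (it simply looks up the square root in $\mathrm{QR}_n$), which is only possible because $n$ is tiny.

```python
from fractions import Fraction
from math import floor, gcd

def make_lsb_oracle(p, q):
    """Simulate A_1(n, x^2) = Lsb(x) for x in QR_n by brute force (toy sizes only)."""
    n = p * q
    qr = {s * s % n for s in range(1, n) if gcd(s, n) == 1}

    def a1(modulus, y):
        # The unique square root of y that is itself a quadratic residue.
        root = next(r for r in qr if r * r % modulus == y)
        return root & 1

    return n, qr, a1

def invert_square(n, y, a1):
    """Recover x in QR_n from y = x^2 by rational approximation (case 2 in QR_n)."""
    inv2 = pow(2, -1, n)                  # requires Python 3.8+
    a, u = 1, Fraction(1)                 # a_0 = 1, u_0 = 1 as in the proof
    for _ in range(n.bit_length() + 1):   # r = |n| + 1 halving steps
        bit = a1(n, a * a * y % n)        # Lsb(a_{t-1} x mod n)
        u = (u + bit) / 2                 # u_t = (u_{t-1} + Lsb) / 2
        a = a * inv2 % n                  # a_t = 2^{-1} a_{t-1} mod n
    ax = floor(u * n + Fraction(1, 2))    # a_r x mod n, since the error is < 1/2
    return pow(a, -1, n) * ax % n

n, qr, a1 = make_lsb_oracle(7, 23)        # p = q = 7 mod 8, so 2 is in QR_n
recovered = {x: invert_square(n, x * x % n, a1) for x in qr}
```

Exact rational arithmetic (`Fraction`) is used for $u_t$ so that the final rounding step is not affected by floating-point error.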


Remarks:

1. As $p, q \equiv 3 \bmod 4$ is assumed, $p$ and $q$ are $\equiv 3 \bmod 8$ or $\equiv -1 \bmod 8$, and hence either $n \equiv 1 \bmod 8$ or $n \equiv 5 \bmod 8$. We do not consider the case $n \equiv 5 \bmod 8$ in the proposition.

2. The proof actually works if we have $2 \in \mathrm{QR}_n$ or $-2 \in \mathrm{QR}_n$. This is equivalent to $n \equiv 1 \bmod 8$, as follows from Theorem A.53. We have $2 \in \mathrm{QR}_n$ if and only if $2 \in \mathrm{QR}_p$ and $2 \in \mathrm{QR}_q$, and this in turn is equivalent to $p \equiv q \equiv -1 \bmod 8$ (by Theorem A.53). On the other hand, $-2 \in \mathrm{QR}_n$ if and only if $-2 \in \mathrm{QR}_p$ and $-2 \in \mathrm{QR}_q$, and this in turn is equivalent to $p \equiv q \equiv 3 \bmod 8$ (by Theorem A.53).
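The equivalences in the second remark can be confirmed by brute force for small primes. The sketch below is only a sanity check; the helper names `is_qr` and `check_pair` are ours, and the quadratic-residue test is a naive search that is feasible only for tiny moduli.

```python
from itertools import combinations

def is_qr(a, n):
    """Naive test: is a (mod n) a square of some s in 1..n-1? (toy sizes only)"""
    a %= n
    return any(s * s % n == a for s in range(1, n))

def check_pair(p, q):
    """Check the three equivalences of the remark for n = p*q."""
    n = p * q
    two = is_qr(2, n)
    minus_two = is_qr(-2, n)
    return (
        two == (p % 8 == 7 and q % 8 == 7)
        and minus_two == (p % 8 == 3 and q % 8 == 3)
        and (two or minus_two) == (n % 8 == 1)
    )

primes_3_mod_4 = [3, 7, 11, 19, 23, 31, 43, 47]
results = [check_pair(p, q) for p, q in combinations(primes_3_mod_4, 2)]
```

Since 2 and $-2$ are units modulo the odd number $n$, any square root found by the search is automatically a unit as well.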


Studying the Square function, we face, in the probabilistic case, an addi- 
tional difficulty compared to the reduction in the RSA case. Membership in 
the domain of the Square function — the set QR, of quadratic residues — is not 
efficiently decidable without knowing the factorization of n. To overcome this
difficulty, we develop a probabilistic polynomial reduction of the quadratic
residuosity property to the predicate defined by the least-significant bit of 
Sqrt. Then the same reduction as in the RSA case also works for the Sqrt 
function. 


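Let $J_n^{+1} := \{x \in \mathbb{Z}_n^* \mid \left(\frac{x}{n}\right) = +1\}$. The predicate

\[
\mathrm{PQR}_n : J_n^{+1} \to \{0,1\}, \quad \mathrm{PQR}_n(x) = \begin{cases} 1 & \text{if } x \in \mathrm{QR}_n, \\ 0 & \text{otherwise,} \end{cases}
\]

is believed to be a trapdoor predicate (see Definition 6.11). In Proposition 7.20, we show how to reduce the computation of $\mathrm{PQR}_n(x)$ to the computation of the least-significant bit of $\mathrm{Sqrt}_n(x)$.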


Proposition 7.20. Let $P, Q \in \mathbb{Z}[X]$ be positive polynomials, and let $A_1$ be a probabilistic polynomial algorithm, such that

\[
\mathrm{prob}(A_1(n, x^2) = \mathrm{Lsb}(x) : x \xleftarrow{u} \mathrm{QR}_n) \ge \frac{1}{2} + \frac{1}{P(k)},
\]

where $n = pq$, $p$ and $q$ are distinct primes, $p, q \equiv 3 \bmod 4$, and $k = |n|$. Then there exists a probabilistic polynomial algorithm $A_2$, such that

\[
\mathrm{prob}(A_2(n, x) = \mathrm{PQR}_n(x)) \ge 1 - \frac{1}{Q(k)} \quad \text{for all } x \in J_n^{+1}.
\]


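Proof. Let $x \in J_n^{+1}$. If $x \in \mathrm{QR}_n$, then $x = \mathrm{Sqrt}_n(x^2)$ and therefore $\mathrm{Lsb}(x) = \mathrm{Lsb}(\mathrm{Sqrt}_n(x^2))$. If $x \notin \mathrm{QR}_n$, then $-x \bmod n = n - x = \mathrm{Sqrt}_n(x^2)$ and $\mathrm{Lsb}(x) \ne \mathrm{Lsb}(\mathrm{Sqrt}_n(x^2))$ (note that $-1 \in J_n^{+1} \setminus \mathrm{QR}_n$ for $p, q \equiv 3 \bmod 4$, by Theorem A.53). Consequently, we get

\[
\mathrm{PQR}_n(x) = \mathrm{Lsb}(x) \oplus \mathrm{Lsb}(\mathrm{Sqrt}_n(x^2)) \oplus 1.
\]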
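The identity can be verified exhaustively for a small modulus. The following Python sketch is a check under toy assumptions: the modulus $n = 7 \cdot 11$ and the helper names `jacobi` and `check_identity` are ours, and $\mathrm{QR}_n$ and $\mathrm{Sqrt}_n$ are computed by brute force.

```python
from math import gcd

def jacobi(a, n):
    """Jacobi symbol (a/n) for odd n > 0, by the standard reciprocity algorithm."""
    a %= n
    result = 1
    while a != 0:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0

def check_identity(p, q):
    """Check PQR_n(x) = Lsb(x) xor Lsb(Sqrt_n(x^2)) xor 1 for all x in J_n^{+1}."""
    n = p * q
    units = [x for x in range(1, n) if gcd(x, n) == 1]
    qr = {x * x % n for x in units}
    j_plus = [x for x in units if jacobi(x, n) == 1]
    for x in j_plus:
        root = next(r for r in qr if r * r % n == x * x % n)  # Sqrt_n(x^2)
        pqr = 1 if x in qr else 0
        if pqr != (x & 1) ^ (root & 1) ^ 1:
            return False
    return len(j_plus) == 2 * len(qr)

identity_holds = check_identity(7, 11)   # p, q = 3 mod 4
```

The final comparison in `check_identity` also confirms $|J_n^{+1}| = 2\,|\mathrm{QR}_n|$ for this modulus.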


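Since for each $y \in \mathrm{QR}_n$ there are exactly two elements $x \in J_n^{+1}$ with $x^2 = y$ and because $|J_n^{+1}| = 2\,|\mathrm{QR}_n|$, we conclude

\[
\mathrm{prob}(A_1(n, x^2) = \mathrm{Lsb}(\mathrm{Sqrt}_n(x^2)) : x \xleftarrow{u} J_n^{+1})
= \mathrm{prob}(A_1(n, y) = \mathrm{Lsb}(\mathrm{Sqrt}_n(y)) : y \xleftarrow{u} \mathrm{QR}_n).
\]

Hence, we get

\[
\mathrm{prob}(\mathrm{PQR}_n(x) = A_1(n, x^2) \oplus \mathrm{Lsb}(x) \oplus 1 : x \xleftarrow{u} J_n^{+1}) \ge \frac{1}{2} + \frac{1}{P(k)}.
\]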


We construct an algorithm $A_2$, such that for every $x \in J_n^{+1}$

\[
\mathrm{prob}(A_2(n, x) = \mathrm{PQR}_n(x)) \ge 1 - \frac{1}{Q(k)}.
\]

Algorithm 7.21.
int A2(int n, x)
 1  $c \leftarrow 0$; $l \leftarrow Q(k)P^2(k)$
 2  for $i \leftarrow 1$ to $l$ do
 3      select $r \in \mathrm{QR}_n$ at random
 4      $c \leftarrow c + (A_1(n, (rx)^2 \bmod n) \oplus \mathrm{Lsb}(rx \bmod n) \oplus 1)$
 5  if $c > l/2$
 6      then return 1
 7      else return 0


Let $x \in J_n^{+1}$. $\mathrm{PQR}_n(x) = \mathrm{PQR}_n(rx)$ is computed $l$ times by applying $A_1$ to the $l$ independent random inputs $rx$. We compute $\mathrm{PQR}_n(rx)$ in each step with a probability $> 1/2$. The weak law of large numbers guarantees that, for sufficiently large $l$, we can compute $\mathrm{PQR}_n(x)$ by the majority of the results, with a high probability. More precisely, let the random variable $S_i$, $i = 1, \ldots, l$, be defined by

\[
S_i = \begin{cases} 1 & \text{if } \mathrm{PQR}_n(rx) = A_1(n, (rx)^2) \oplus \mathrm{Lsb}(rx) \oplus 1, \\ 0 & \text{otherwise,} \end{cases}
\]

with $r \in \mathrm{QR}_n$ randomly chosen, as in the algorithm. Then we have $\mathrm{prob}(S_i = 1) \ge 1/2 + 1/P(k)$. The random variables $S_i$, $i = 1, \ldots, l$, are independent. If $\mathrm{PQR}_n(x) = 1$, we get by Corollary B.18

\[
\mathrm{prob}\left(c > \frac{l}{2}\right) \ge 1 - \frac{P^2(k)}{4l} \ge 1 - \frac{1}{Q(k)}.
\]

The case $\mathrm{PQR}_n(x) = 0$ follows analogously. Thus, we have shown

\[
\mathrm{prob}(A_2(n, x) = \mathrm{PQR}_n(x)) \ge 1 - \frac{1}{Q(k)},
\]

as desired.


Theorem 7.22. Let $P, Q \in \mathbb{Z}[X]$ be positive polynomials and $A_1$ be a probabilistic polynomial algorithm, such that

\[
\mathrm{prob}(A_1(n, x^2) = \mathrm{Lsb}(x) : x \xleftarrow{u} \mathrm{QR}_n) \ge \frac{1}{2} + \frac{1}{P(k)},
\]

where $n := pq$, $p$ and $q$ are distinct primes, $p, q \equiv 3 \bmod 4$, and $k := |n|$. Then there is a probabilistic polynomial algorithm $A_2$, such that

\[
\mathrm{prob}(A_2(n, x) = \mathrm{Sqrt}_n(x)) \ge 1 - 2^{-Q(k)} \quad \text{for all } x \in \mathrm{QR}_n.
\]

Proof. The proof runs in the same way as the proof of Theorem 7.14. We only describe the differences to this proof. Here, the algorithm $A_1$ is only applicable to quadratic residues. However, it is easy to compute $\left(\frac{x}{n}\right)$ for $x \in \mathbb{Z}_n^*$, and we can use algorithm $A_2$ from Proposition 7.20 to check whether a given element $x \in J_n^{+1}$ is a quadratic residue. Assume that $\mathrm{prob}(A_2(n, x) = \mathrm{PQR}_n(x)) \ge 1 - 1/P^2(k)$.

If $p, q \equiv 3 \bmod 4$, we have $-1 \notin \mathrm{QR}_n$ (see Theorem A.53). Therefore, either $a$ or $-a \in \mathrm{QR}_n$ for $\left(\frac{a}{n}\right) = 1$. We are looking for $m$ multipliers $\tilde{a}$ of the form $a_t + i a_{t-1} + b$ with $\left(\frac{\tilde{a}}{n}\right) = 1$, where $m := \min(2^t \varepsilon^{-2}, 2k\varepsilon^{-2})$. If $\tilde{a} \in \mathrm{QR}_n$, $\mathrm{Lsb}(\overline{\tilde{a} x})$ can be computed with algorithm $A_1$, and if $-\tilde{a} = n - \tilde{a} \in \mathrm{QR}_n$, $\mathrm{Lsb}(\overline{\tilde{a} x}) = 1 - \mathrm{Lsb}(\overline{-\tilde{a} x})$ and $\mathrm{Lsb}(\overline{-\tilde{a} x})$ can be computed with algorithm $A_1$. $\mathrm{Lsb}(\overline{\tilde{a} x})$ is correctly computed if $A_1$ correctly computes the predicate Lsb and $A_2$ from Proposition 7.20 correctly computes the predicate $\mathrm{PQR}_n$. Both events are independent. Thus $\mathrm{Lsb}(\overline{\tilde{a} x})$ is computed correctly with a probability $> (1/2 + 1/P(k))(1 - 1/P^2(k)) > 1/2 + 1/(2P(k))$. Thus we set $\varepsilon = 1/(2P(k))$.

With $i$ varying in an interval, the fraction of the multipliers $a_t + i a_{t-1} + b$ which are in $J_n^{+1}$ differs from $1/2$ only negligibly, because $J_n^{+1}$ is nearly uniformly distributed in $\mathbb{Z}_n^*$ (see [Peralta92]).

We double the range for $i$, take $i \in [-m, m - 1]$, and halve the distances of the initial rational approximations:

\[
|ax \bmod n - un| \le \frac{\varepsilon^3 n}{16} \quad \text{and} \quad |bx \bmod n - vn| \le \frac{\varepsilon n}{16}.
\]

Now we obtain the same estimates as in the proof of Theorem 7.14.


Corollary 7.23. Let $I := \{n \mid n = pq,\ p \text{ and } q \text{ distinct primes},\ |p| = |q|,\ p, q \equiv 3 \bmod 4\}$. Provided that the factorization assumption is true,

\[
\mathrm{QRLsb} = (\mathrm{QRLsb}_n : \mathrm{QR}_n \to \{0,1\},\ x \mapsto \mathrm{Lsb}(x))_{n \in I}
\]

is a family of hard-core predicates for

\[
\mathrm{Square} = (\mathrm{Square}_n : \mathrm{QR}_n \to \mathrm{QR}_n,\ x \mapsto x^2)_{n \in I}.
\]

Proof. The proof is analogous to the proof of Corollary 7.11. Observe that the ability to compute square roots modulo $n$ is equivalent to the ability to compute the prime factors of $n$ (Proposition A.64).


Exercises

1. Compute $\mathrm{Log}_{p,g}(17)$ using Algorithm 7.6, for $p := 19$ and $g := 2$ (note that 2, 4 and 13 are principal square roots).


2. Let $p$ be an odd prime number, and $g$ be a primitive root in $\mathbb{Z}_p^*$. For $y := \mathrm{Exp}_{p,g}(x)$, we have

\[
B_{p,g}(y) = 0 \text{ if and only if } 0 \le x < \frac{p-1}{2},
\]

\[
B_{p,g}(y^2) = 0 \text{ if and only if } 0 \le x < \frac{p-1}{4} \text{ or } \frac{p-1}{2} \le x < \frac{3(p-1)}{4},
\]

and so on. Let $A_1$ be a deterministic polynomial algorithm such that $A_1(p, g, y) = B_{p,g}(y)$ for all $y \in \mathbb{Z}_p^*$. By using a binary search technique, prove that there is a deterministic polynomial algorithm $A_2$ such that

\[
A_2(p, g, y) = \mathrm{Log}_{p,g}(y)
\]

for all $y \in \mathbb{Z}_p^*$.


3. Let $p$ be an odd prime number. Suppose that $p - 1 = 2^t a$, where $a$ is odd. Let $g$ be a primitive root in $\mathbb{Z}_p^*$.
   a. Show how the $t$ least-significant bits (bits at the positions 0 to $t - 1$) of $x \in \mathbb{Z}_{p-1}$ can be easily computed from $g^x = \mathrm{Exp}_{p,g}(x)$.
   b. Denote by $\mathrm{Lsb}_t(x)$ the $t$-th least-significant bit of $x$ (bit at position $t$ counted from the right, beginning with 0). Let $A_1$ be a deterministic polynomial algorithm such that

      \[ A_1(p, g, g^x) = \mathrm{Lsb}_t(x) \]

      for all $x \in \mathbb{Z}_{p-1}$. By using $A_1$, construct a deterministic polynomial algorithm $A_2$ such that

      \[ A_2(p, g, y) = \mathrm{Log}_{p,g}(y) \]

      for all $y \in \mathbb{Z}_p^*$. (Here assume that a deterministic algorithm for computing square roots exists.)
   c. Show that $\mathrm{Lsb}_t$ yields a hard-core predicate for the Exp family.


4. As in the preceding exercise, $\mathrm{Lsb}_j$ denotes the $j$-th least-significant bit.
   a. Let $A_1$ be a deterministic polynomial algorithm such that for all $x \in \mathbb{Z}_{p-1}$

      \[ A_1(p, g, g^x, \mathrm{Lsb}_t(x), \ldots, \mathrm{Lsb}_{t+j-1}(x)) = \mathrm{Lsb}_{t+j}(x), \]

      where $p$ is an odd prime, $g \in \mathbb{Z}_p^*$ is a primitive root, $p - 1 = 2^t a$, $a$ is odd, $k = |p|$ and $j \in \{0, \ldots, \lfloor \log_2(k) \rfloor\}$. Construct a deterministic polynomial algorithm $A_2$ such that

      \[ A_2(p, g, y) = \mathrm{Log}_{p,g}(y) \]

      for all $y \in \mathbb{Z}_p^*$.
   b. Let $P, Q \in \mathbb{Z}[X]$ be positive polynomials and $A_1$ be a probabilistic polynomial algorithm such that

      \[ \mathrm{prob}(A_1(p, g, g^x, \mathrm{Lsb}_t(x), \ldots, \mathrm{Lsb}_{t+j-1}(x)) = \mathrm{Lsb}_{t+j}(x) : x \xleftarrow{u} \mathbb{Z}_{p-1}) \ge \frac{1}{2} + \frac{1}{P(k)}, \]

      where $p$ is an odd prime, $g \in \mathbb{Z}_p^*$ is a primitive root, $p - 1 = 2^t a$, $a$ is odd, $k = |p|$ and $j \in \{0, \ldots, \lfloor \log_2(k) \rfloor\}$. Construct a probabilistic polynomial algorithm $A_2$ such that

      \[ \mathrm{prob}(A_2(p, g, y) = \mathrm{Log}_{p,g}(y)) \ge 1 - 2^{-Q(k)} \]

      for all $y \in \mathbb{Z}_p^*$.


5. Let $I := \{(p, g) \mid p \text{ an odd prime},\ g \in \mathbb{Z}_p^* \text{ a primitive root}\}$ and $I_k := \{(p, g) \in I \mid |p| = k\}$. Assume that the discrete logarithm assumption (see Definition 6.1) is true.

   Show that for every probabilistic polynomial algorithm $A$, with inputs $p, g, y, b_0, \ldots, b_{j-1}$, $1 \le j \le \lfloor \log_2(k) \rfloor$, and for every positive polynomial $P$, there is a $k_0 \in \mathbb{N}$ such that

   \[ \mathrm{prob}(A(p, g, g^x, \mathrm{Lsb}_t(x), \ldots, \mathrm{Lsb}_{t+j-1}(x)) = \mathrm{Lsb}_{t+j}(x) : (p, g) \xleftarrow{u} I_k,\ x \xleftarrow{u} \mathbb{Z}_{p-1}) \le \frac{1}{2} + \frac{1}{P(k)} \]

   for $k \ge k_0$ and for all $j \in \{0, \ldots, \lfloor \log_2(k) \rfloor\}$. Here $t$ is defined by $p - 1 = 2^t a$, with $a$ odd.
   In particular, the predicates $\mathrm{Lsb}_{t+j}$, $0 \le j \le \lfloor \log_2(k) \rfloor$, are hard-core predicates for Exp.


6. Compute the rational approximation $(a, u)$ for $13 \in \mathbb{Z}_{29}$.


7. Let $p := 17$, $q := 23$, $n := pq$ and $e := 3$. List the least-significant bits that $A_1$ will return if you compute $\mathrm{RSA}_{n,e}^{-1}(49)$ using Algorithm 7.12 (note that $49 = 196^3 \bmod n$).


8. Let $n := pq$, with $p$ and $q$ distinct primes and $e$ relatively prime to $\varphi(n)$, $x \in \mathbb{Z}_n^*$ and $y := \mathrm{RSA}_{n,e}(x) = x^e \bmod n$. Msb is defined analogously to Definition 7.1. Show that you can compute $\mathrm{Msb}(x)$ from $y$ if and only if you can compute $\mathrm{Lsb}(x)$ from $y$.


9. Show that the most-significant bit of x is a hard-core predicate for the 
RSA family, provided that the RSA assumption is true. 


10. Use the most-significant bit and prove Proposition 7.12 using a binary search technique (analogous to Exercise 2).

11. RSA inversion by binary division ([GolMicTon82]). Let $y := \mathrm{RSA}_{n,e}(x) = x^e \bmod n$. Let $A$ be an algorithm that on input $y$ outputs $\mathrm{Lsb}(x)$. Let $k := |n|$, and let $2^{-e} \in \mathbb{Z}_n^*$ be the inverse of $2^e \in \mathbb{Z}_n^*$. Compute $x$ from $y$ by using the bit vector $(b_{k-1}, \ldots, b_0)$, defined by

    \[
    \begin{aligned}
    y_0 &= y, \quad b_0 = A(y_0), \\
    y_i &= \begin{cases} y_{i-1}\, 2^{-e} \bmod n & \text{if } b_{i-1} = 0, \\ (n - y_{i-1})\, 2^{-e} \bmod n & \text{otherwise,} \end{cases} \\
    b_i &= A(y_i), \quad \text{for } 1 \le i < k.
    \end{aligned}
    \]

    Describe the algorithm inverting RSA and show that it really does invert RSA.
12. a. Let $A_1$ be a deterministic polynomial algorithm such that

       \[ A_1(n, e, x^e, \mathrm{Lsb}_0(x), \ldots, \mathrm{Lsb}_{j-1}(x)) = \mathrm{Lsb}_j(x) \]

       for all $x \in \mathbb{Z}_n^*$, where $n := pq$, $p$ and $q$ are odd distinct primes, $e$ is relatively prime to $\varphi(n)$, $k := |n|$ and $j \in \{0, \ldots, \lfloor \log_2(k) \rfloor\}$. Construct a deterministic polynomial algorithm $A_2$ such that

       \[ A_2(n, e, x^e) = x \]

       for all $x \in \mathbb{Z}_n^*$.
    b. Let $P, Q \in \mathbb{Z}[X]$ be positive polynomials and $A_1$ be a probabilistic polynomial algorithm, such that

       \[ \mathrm{prob}(A_1(n, e, x^e, \mathrm{Lsb}_0(x), \ldots, \mathrm{Lsb}_{j-1}(x)) = \mathrm{Lsb}_j(x) : x \xleftarrow{u} \mathbb{Z}_n^*) \ge \frac{1}{2} + \frac{1}{P(k)}, \]

       where $n := pq$, $p$ and $q$ are odd distinct primes, $e$ is relatively prime to $\varphi(n)$, $k := |n|$ and $j \in \{0, \ldots, \lfloor \log_2(k) \rfloor\}$. Construct a probabilistic polynomial algorithm $A_2$ such that

       \[ \mathrm{prob}(A_2(n, e, x^e) = x) \ge 1 - 2^{-Q(k)} \]

       for all $x \in \mathbb{Z}_n^*$.


13. Let $I := \{(n, e) \mid n = pq,\ p \ne q \text{ prime numbers},\ |p| = |q|,\ e < \varphi(n),\ e \text{ prime to } \varphi(n)\}$ and $I_k := \{(n, e) \in I \mid n = pq,\ |p| = |q| = k\}$. Assume that the RSA assumption (see Definition 6.7) is true.

    Show that for every probabilistic polynomial algorithm $A$, with inputs $n, e, y, b_0, \ldots, b_{j-1}$, $0 \le j \le \lfloor \log_2(|n|) \rfloor$, and every positive polynomial $P$, there is a $k_0 \in \mathbb{N}$ such that

    \[ \mathrm{prob}(A(n, e, x^e, \mathrm{Lsb}_0(x), \ldots, \mathrm{Lsb}_{j-1}(x)) = \mathrm{Lsb}_j(x) : (n, e) \xleftarrow{u} I_k,\ x \xleftarrow{u} \mathbb{Z}_n^*) \le \frac{1}{2} + \frac{1}{P(k)} \]

    for $k \ge k_0$ and for all $j \in \{0, \ldots, \lfloor \log_2(|n|) \rfloor\}$. In particular, the predicates $\mathrm{Lsb}_j$, $0 \le j \le \lfloor \log_2(|n|) \rfloor$, are hard-core predicates for RSA.


8. One-Way Functions and Pseudorandomness 


There is a close relationship between encryption and randomness. The se- 
curity of encryption algorithms usually depends on the random choice of 
keys and bit sequences. A famous example is Shannon’s result. Ciphers with 
perfect secrecy require randomly chosen key strings that are of the same 
length as the encrypted message. In Chapter 9, we will study the classical 
Shannon approach to provable security, together with more recent notions of 
security. One main problem is that truly random bit sequences of sufficient 
length are not available in most practical situations. Therefore, one works 
with pseudorandom bit sequences. They appear to be random, but actually 
they are generated by an algorithm. Such algorithms are called pseudoran- 
dom bit generators. They output, given a short random input value (called the 
seed), a long pseudorandom bit sequence. Classical techniques for the genera- 
tion of pseudorandom bits or numbers (see [Knuth98]) yield well-distributed 
sequences. Therefore, they are well-suited for Monte Carlo simulations. How- 
ever, they are often cryptographically insecure. For example, in linear con- 
gruential pseudorandom number generators or linear feedback shift registers 
(see, e.g., [MenOorVan96]), the secret parameters and hence the complete 
pseudorandom sequence can be efficiently computed from a small number of 
outputs. 
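To illustrate how little output suffices in the linear congruential case, here is a minimal Python sketch. It makes simplifying assumptions that are ours and not from the text: the modulus $m$ is known to the attacker, the difference of the first two outputs is invertible modulo $m$, and the function names and the concrete parameters are hypothetical.

```python
def lcg_outputs(m, a, c, seed, count):
    """Outputs x_1, x_2, ... of the linear congruential generator x -> a*x + c mod m."""
    xs, x = [], seed
    for _ in range(count):
        x = (a * x + c) % m
        xs.append(x)
    return xs

def recover_lcg(m, x0, x1, x2):
    """Recover (a, c) from three consecutive outputs, assuming x1 - x0 is invertible mod m."""
    # x1 = a*x0 + c and x2 = a*x1 + c imply x2 - x1 = a*(x1 - x0) (mod m).
    a = (x2 - x1) * pow(x1 - x0, -1, m) % m   # modular inverse, Python 3.8+
    c = (x1 - a * x0) % m
    return a, c

m = 2**31 - 1                       # a prime modulus (hypothetical parameter choice)
secret_a, secret_c = 48271, 12345   # hypothetical secret parameters
outs = lcg_outputs(m, secret_a, secret_c, seed=42, count=5)
a, c = recover_lcg(m, outs[0], outs[1], outs[2])
predicted_next = (a * outs[2] + c) % m
```

Three outputs determine both parameters here, and every later output can then be predicted, which is exactly the kind of weakness a cryptographically secure generator must exclude.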

It turns out that computationally perfect (hence cryptographically secure) 
pseudorandom bit generators can be derived from one-way permutations with 
hard-core predicates. We will discuss this close relation in this chapter. The 
pseudorandom bit generators G studied are families of functions whose in- 
dexes vary over a set of keys. Before we can use G, we have to select such a 
key with a sufficiently large security parameter. 

Of course, even applying a perfect pseudorandom generator requires start- 
ing with a truly random seed. Thus, in any case you need some “natural” 
source of random bits, such as independent fair coin tosses (see Chapter 5). 


8.1 Computationally Perfect Pseudorandom Bit 
Generators 


In the definition of pseudorandom generators, we use the notion of polynomial functions.

Definition 8.1. We call a function $l : \mathbb{N} \to \mathbb{N}$ a polynomial function if it is computable by a polynomial algorithm and if there is a polynomial $Q \in \mathbb{Z}[X]$, such that $l(k) \le Q(k)$ for all $k \in \mathbb{N}$.


Now we are ready to define pseudorandom bit generators. 


Definition 8.2. Let $I = (I_k)_{k \in \mathbb{N}}$ be a key set with security parameter $k$, and let $K$ be a probabilistic polynomial sampling algorithm for $I$, which on input $1^k$ outputs an $i \in I_k$. Let $l$ be a polynomial function.

A pseudorandom bit generator with key generator $K$ and stretch function $l$ is a family $G = (G_i)_{i \in I}$ of functions

\[
G_i : X_i \to \{0,1\}^{l(k)} \quad (i \in I_k),
\]

such that

1. $G$ is computable by a deterministic polynomial algorithm $G$: $G(i, x) = G_i(x)$ for all $i \in I$ and $x \in X_i$.

2. There is a uniform sampling algorithm $S$ for $X := (X_i)_{i \in I}$, which on input $i \in I$ outputs $x \in X_i$.


The generator $G$ is computationally perfect (or cryptographically secure), if the pseudorandom sequences generated by $G$ cannot be distinguished from true random sequences by an efficient algorithm; i.e., for every positive polynomial $P \in \mathbb{Z}[X]$ and every probabilistic polynomial algorithm $A$ with inputs $i \in I_k$, $z \in \{0,1\}^{l(k)}$ and output in $\{0,1\}$, there is a $k_0 \in \mathbb{N}$ such that for all $k \ge k_0$

\[
\big|\, \mathrm{prob}(A(i, z) = 1 : i \leftarrow K(1^k),\ z \xleftarrow{u} \{0,1\}^{l(k)})
- \mathrm{prob}(A(i, G_i(x)) = 1 : i \leftarrow K(1^k),\ x \xleftarrow{u} X_i) \,\big| \le \frac{1}{P(k)}.
\]
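The difference of the two probabilities in this definition (often called the distinguishing advantage) can be computed exactly for toy examples by enumerating all seeds and all bit strings. The Python sketch below is such an illustration under simplifying assumptions of ours: there is a single fixed key (so the key $i$ is omitted), the seed space is $\{0,1\}^4$, and the generator `parity_generator` and the test `parity_test` are deliberately trivial, hypothetical examples.

```python
from fractions import Fraction
from itertools import product

def parity_generator(seed):
    """Toy, insecure generator: stretch the seed by one bit, the parity of the seed."""
    return seed + (sum(seed) % 2,)

def parity_test(z):
    """Statistical test A: output 1 iff the last bit equals the parity of the other bits."""
    return 1 if z[-1] == sum(z[:-1]) % 2 else 0

def advantage(test, generator, seed_len):
    """Exact |prob(A(z)=1 : z uniform) - prob(A(G(x))=1 : x uniform)| by enumeration."""
    seeds = list(product((0, 1), repeat=seed_len))
    strings = list(product((0, 1), repeat=seed_len + 1))
    p_pseudo = Fraction(sum(test(generator(s)) for s in seeds), len(seeds))
    p_random = Fraction(sum(test(z) for z in strings), len(strings))
    return abs(p_random - p_pseudo)

adv = advantage(parity_test, parity_generator, seed_len=4)
```

The test accepts every generated sequence but only half of all bit strings, so the advantage is $1/2$ and this generator is far from computationally perfect.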


Remarks: 


1. The probabilistic polynomial algorithm $A$ in the definition may be considered as a statistical test trying to compute some property which distinguishes truly random sequences in $\{0,1\}^{l(k)}$ from the pseudorandom sequences generated by $G$. Classical statistical tests for randomness, such as the Chi-square test ([Knuth98], Chapter 3), can be considered as such tests and can be implemented as polynomial algorithms. Thus, “computationally perfect” means that no statistical test that can be implemented as a probabilistic algorithm with polynomial running time can significantly distinguish between true random sequences and sequences generated by $G$, provided a sufficiently large key $i$ is chosen.

2. By condition 2, we can randomly generate uniformly distributed seeds $x \in X_i$ for the generator $G$. We could (but do not) generalize the definition and allow non-uniform seed generators $S$ (see the analogous remark after the formal definition of one-way functions, Definition 6.12, remark 2). The constructions and proofs given below also work in this more general case.

3. We study only computationally perfect pseudorandom generators. Therefore, we do not specify any other level of pseudorandomness for the sequences generated by $G$. In the literature, the term “pseudorandom generator” is sometimes only used for generators that are computationally perfect (see, e.g., [Goldreich99]; [Goldreich01]).

4. Our definition of computationally perfect pseudorandom generators is a definition in the “public-key model”. The key $i$ is an input to the statistical tests $A$ (which are the adversaries). Thus, the key $i$ is assumed to be public and available to everyone. Definition 8.2 can be adapted to the “private-key model”, where the selected key $i$ is kept secret, and hence is not known to the adversaries. The input $i$ of the statistical tests $A$ has to be omitted. We only discuss the public-key model.

5. Admissible key generators can be analogously defined as in Definition 
6.13. 

6. The probability in the definition is also taken over the random generation of a key $i$, with a given security parameter $k$. Even for very large $k$, there may be keys $i$ such that $A$ can successfully distinguish pseudorandom from truly random sequences. However, when generating a key $i$ by $K$, the probability of obtaining one for which $A$ has a significant chance of success is negligibly small (see Exercise 8 in Chapter 6).

7. As is common, we require that the pseudorandom generator can be implemented by a deterministic algorithm. However, if the sequences can be computed probabilistically in polynomial time, then we can also compute them almost deterministically: for every positive polynomial $Q$, there is a probabilistic polynomial algorithm $\tilde{G}(i, x)$ with

\[
\mathrm{prob}(\tilde{G}(i, x) = G_i(x)) \ge 1 - 2^{-Q(k)} \quad (i \in I_k)
\]

(see Proposition 5.6 and Exercise 5 in Chapter 5). Thus, a modified definition, which relaxes condition 1 to “Monte-Carlo computable”, would also work. In all our examples, the pseudorandom sequences can be efficiently computed by deterministic algorithms.


We will now derive pseudorandom bit generators from one-way permuta- 
tions with hard-core predicates (see Definition 6.15). These generators turn 
out to be computationally perfect. The construction was introduced by Blum 
and Micali ([BluMic84]). 


Definition 8.3. Let $I = (I_k)_{k \in \mathbb{N}}$ be a key set with security parameter $k$, and let $Q \in \mathbb{Z}[X]$ be a positive polynomial.

Let $f = (f_i : D_i \to D_i)_{i \in I}$ be a family of one-way permutations with hard-core predicate $B = (B_i : D_i \to \{0,1\})_{i \in I}$ and key generator $K$. Then we have the following pseudorandom bit generator with stretch function $Q$ and key generator $K$:

\[
G := G(f, B, Q) := (G_i : D_i \to \{0,1\}^{Q(k)})_{k \in \mathbb{N},\, i \in I_k},
\]

\[
x \in D_i \mapsto (B_i(x),\ B_i(f_i(x)),\ B_i(f_i^2(x)),\ \ldots,\ B_i(f_i^{Q(k)-1}(x))).
\]

We call this generator the pseudorandom bit generator induced by $f$, $B$ and $Q$.


Remark. We obtain the pseudorandom bit sequence by a very simple construction: choose some random seed $x \xleftarrow{u} D_i$. Compute the first pseudorandom bit as $B_i(x)$, apply $f_i$ and get $y := f_i(x)$. Compute the next pseudorandom bit as $B_i(y)$, apply $f_i$ to $y$ and get a new $y := f_i(y)$. Compute the next pseudorandom bit as $B_i(y)$, and so on.
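The construction in this remark can be sketched in a few lines of Python. The function name `induced_bits` is ours; the instance below uses modular squaring with the least-significant bit as the predicate, on the toy modulus $n = 7 \cdot 11$, which is an assumption chosen only so that the output can be followed by hand and which offers no security.

```python
def induced_bits(f, B, x, length):
    """Return (B(x), B(f(x)), ..., B(f^(length-1)(x))) together with f^length(x)."""
    bits = []
    for _ in range(length):
        bits.append(B(x))   # next pseudorandom bit from the current state
        x = f(x)            # advance the state with the permutation
    return bits, x

n = 7 * 11                  # toy modulus with both primes = 3 mod 4

def square(x):
    return x * x % n

def lsb(x):
    return x & 1

# Seed 4 = 2^2 is a quadratic residue; its orbit is 4 -> 16 -> 25 -> 9 -> 4 -> ...
bits, final_state = induced_bits(square, lsb, 4, 5)
print(bits, final_state)    # [0, 0, 1, 1, 0] 16
```

The short cycle of length 4 visible here is an artifact of the tiny modulus. The second return value is the iterated image of the seed under the permutation, i.e., the seed after `length` applications of `f`.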


Examples: 


1. Provided that the discrete logarithm assumption is true, the discrete exponential function

\[
\mathrm{Exp} = (\mathrm{Exp}_{p,g} : \mathbb{Z}_{p-1} \to \mathbb{Z}_p^*,\ x \mapsto g^x)_{(p,g) \in I}
\]

with $I := \{(p, g) \mid p \text{ prime},\ g \in \mathbb{Z}_p^* \text{ a primitive root}\}$ is a bijective one-way function, and the most-significant bit $\mathrm{Msb}_p(x)$ defined by

\[
\mathrm{Msb}_p(x) = \begin{cases} 0 & \text{for } 0 \le x < \frac{p-1}{2}, \\ 1 & \text{for } \frac{p-1}{2} \le x < p - 1, \end{cases}
\]

is a hard-core predicate for Exp (see Section 7.1). Identifying $\mathbb{Z}_{p-1}$ with $\mathbb{Z}_p^*$ in the straightforward way,¹ we may consider Exp as a one-way permutation. The induced pseudorandom bit generator is called the discrete exponential generator (or Blum-Micali generator).

2. Provided that the RSA assumption is true, the RSA family

\[
\mathrm{RSA} = (\mathrm{RSA}_{n,e} : \mathbb{Z}_n^* \to \mathbb{Z}_n^*,\ x \mapsto x^e)_{(n,e) \in I}
\]

with $I := \{(n, e) \mid n = pq,\ p, q \text{ distinct primes},\ |p| = |q|,\ e \text{ prime to } \varphi(n)\}$ is a one-way permutation, and the least-significant bit $\mathrm{Lsb}_n(x)$ is a hard-core predicate for RSA (see Section 7.2). The induced pseudorandom bit generator is called the RSA generator.

3. Provided that the factorization assumption is true, the modular squaring function

\[
\mathrm{Square} = (\mathrm{Square}_n : \mathrm{QR}_n \to \mathrm{QR}_n,\ x \mapsto x^2)_{n \in I}
\]

with $I := \{n \mid n = pq,\ p, q \text{ distinct primes},\ |p| = |q|,\ p, q \equiv 3 \bmod 4\}$ is a one-way permutation, and the least-significant bit $\mathrm{Lsb}_n(x)$ is a hard-core predicate for Square (see Section 7.3). The induced pseudorandom bit generator is called the ($x^2 \bmod n$) generator or Blum-Blum-Shub generator.

Here the computation of the pseudorandom bits is particularly simple: choose a random seed $x \in \mathbb{Z}_n^*$, repeatedly square $x$ and reduce it modulo $n$, and take the least-significant bit after each step.

¹ $\mathbb{Z}_{p-1} = \{0, \ldots, p-2\} \to \mathbb{Z}_p^* = \{1, \ldots, p-1\}$, $0 \mapsto p-1$, $x \mapsto x$ for $1 \le x \le p-2$.


We will now show that the pseudorandom bit generators, induced by one-way permutations, are computationally perfect. For later applications, it is useful to prove a slightly more general statement that covers the pseudorandom bits and the seed, which is encrypted using $f_i^{Q(k)}$.


Theorem 8.4. Let $I = (I_k)_{k \in \mathbb{N}}$ be a key set with security parameter $k$, and let $Q \in \mathbb{Z}[X]$ be a positive polynomial. Let $f = (f_i : D_i \to D_i)_{i \in I}$ be a family of one-way permutations with hard-core predicate $B = (B_i : D_i \to \{0,1\})_{i \in I}$ and key generator $K$. Let $G := G(f, B, Q)$ be the induced pseudorandom bit generator.

Then, for every probabilistic polynomial algorithm $A$ with inputs $i \in I_k$, $z \in \{0,1\}^{Q(k)}$, $y \in D_i$ and output in $\{0,1\}$, and every positive polynomial $P \in \mathbb{Z}[X]$, there is a $k_0 \in \mathbb{N}$ such that for all $k \ge k_0$

\[
\big|\, \mathrm{prob}(A(i, G_i(x), f_i^{Q(k)}(x)) = 1 : i \leftarrow K(1^k),\ x \xleftarrow{u} D_i)
- \mathrm{prob}(A(i, z, y) = 1 : i \leftarrow K(1^k),\ z \xleftarrow{u} \{0,1\}^{Q(k)},\ y \xleftarrow{u} D_i) \,\big| \le \frac{1}{P(k)}.
\]

Remark. The theorem states that for sufficiently large keys, the probability of distinguishing successfully between truly random sequences and pseudorandom sequences, using a given efficient algorithm, is negligibly small, even if the encryption $f_i^{Q(k)}(x)$ of the seed $x$ is known.


Proof. Assume that there is a probabilistic polynomial algorithm $A$, such that the inequality is false for infinitely many $k$. Replacing $A$ by $1 - A$ if necessary, we may drop the absolute value and assume that

\[
\mathrm{prob}(A(i, G_i(x), f_i^{Q(k)}(x)) = 1 : i \leftarrow K(1^k),\ x \xleftarrow{u} D_i)
- \mathrm{prob}(A(i, z, y) = 1 : i \leftarrow K(1^k),\ z \xleftarrow{u} \{0,1\}^{Q(k)},\ y \xleftarrow{u} D_i) > \frac{1}{P(k)},
\]

for $k$ in an infinite subset $\mathcal{K}$ of $\mathbb{N}$.
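For $k \in \mathcal{K}$ and $i \in I_k$, we consider the following sequence of distributions $p_{i,0}, p_{i,1}, \ldots, p_{i,Q(k)}$ on $Z_i := \{0,1\}^{Q(k)} \times D_i$:²

² We use the notation for image distributions introduced in Appendix B.1 (p. 330): $p_{i,r}$ is the direct product of the uniform distribution on $\{0,1\}^{Q(k)-r}$ with the image of the uniform distribution on $D_i$ under the mapping $x \mapsto (B_i(x), B_i(f_i(x)), \ldots, B_i(f_i^{r-1}(x)), f_i^r(x))$.

\[
\begin{aligned}
p_{i,0} &:= \{(b_1, \ldots, b_{Q(k)}, y) : (b_1, \ldots, b_{Q(k)}) \xleftarrow{u} \{0,1\}^{Q(k)},\ y \xleftarrow{u} D_i\} \\
p_{i,1} &:= \{(b_1, \ldots, b_{Q(k)-1}, B_i(x), f_i(x)) : (b_1, \ldots, b_{Q(k)-1}) \xleftarrow{u} \{0,1\}^{Q(k)-1},\ x \xleftarrow{u} D_i\} \\
p_{i,2} &:= \{(b_1, \ldots, b_{Q(k)-2}, B_i(x), B_i(f_i(x)), f_i^2(x)) : (b_1, \ldots, b_{Q(k)-2}) \xleftarrow{u} \{0,1\}^{Q(k)-2},\ x \xleftarrow{u} D_i\} \\
&\ \ \vdots \\
p_{i,r} &:= \{(b_1, \ldots, b_{Q(k)-r}, B_i(x), B_i(f_i(x)), \ldots, B_i(f_i^{r-1}(x)), f_i^r(x)) : (b_1, \ldots, b_{Q(k)-r}) \xleftarrow{u} \{0,1\}^{Q(k)-r},\ x \xleftarrow{u} D_i\} \\
&\ \ \vdots \\
p_{i,Q(k)} &:= \{(B_i(x), B_i(f_i(x)), \ldots, B_i(f_i^{Q(k)-1}(x)), f_i^{Q(k)}(x)) : x \xleftarrow{u} D_i\}.
\end{aligned}
\]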


We start with truly random bit sequences. In each step, we replace one more truly random bit from the right with a pseudorandom bit. The seed $x$ encrypted by $f_i^r$ is always appended on the right. Note that the image $\{f_i(x) : x \xleftarrow{u} D_i\}$ of the uniform distribution under $f_i$ is again the uniform distribution, since $f_i$ is bijective. Finally, in $p_{i,Q(k)}$ we have the distribution of the pseudorandom sequences supplemented by the encrypted seed. We observe that

\[
\mathrm{prob}(A(i, z, y) = 1 : i \leftarrow K(1^k),\ z \xleftarrow{u} \{0,1\}^{Q(k)},\ y \xleftarrow{u} D_i)
= \mathrm{prob}(A(i, z, y) = 1 : i \leftarrow K(1^k),\ (z, y) \xleftarrow{p_{i,0}} Z_i)
\]

and

\[
\mathrm{prob}(A(i, G_i(x), f_i^{Q(k)}(x)) = 1 : i \leftarrow K(1^k),\ x \xleftarrow{u} D_i)
= \mathrm{prob}(A(i, z, y) = 1 : i \leftarrow K(1^k),\ (z, y) \xleftarrow{p_{i,Q(k)}} Z_i).
\]

Thus, our assumption says that for $k \in \mathcal{K}$, the algorithm $A$ is able to distinguish between the distribution $p_{i,Q(k)}$ (of pseudorandom sequences) and the (uniform) distribution $p_{i,0}$. Hence, $A$ must be able to distinguish between two subsequent distributions $p_{i,r}$ and $p_{i,r+1}$, for some $r$.
Since f; is bijective, we have the following equation (8.1): 


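\[
\begin{aligned}
p_{i,r} &= \{(b_1, \ldots, b_{Q(k)-r}, B_i(x), B_i(f_i(x)), \ldots, B_i(f_i^{r-1}(x)), f_i^r(x)) :
(b_1, \ldots, b_{Q(k)-r}) \stackrel{u}{\leftarrow} \{0,1\}^{Q(k)-r},\ x \stackrel{u}{\leftarrow} D_i\} \\
&= \{(b_1, \ldots, b_{Q(k)-r}, B_i(f_i(x)), B_i(f_i^2(x)), \ldots, B_i(f_i^r(x)), f_i^{r+1}(x)) :
(b_1, \ldots, b_{Q(k)-r}) \stackrel{u}{\leftarrow} \{0,1\}^{Q(k)-r},\ x \stackrel{u}{\leftarrow} D_i\}.
\end{aligned}
\]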


We see that $p_{i,r}$ differs from $p_{i,r+1}$ only at one position, namely at position
$Q(k) - r$. There, the hard-core bit $B_i(x)$ is replaced by a truly random bit.
Therefore, algorithm $A$, which distinguishes between $p_{i,r}$ and $p_{i,r+1}$, can also
be used to compute $B_i(x)$ from $f_i(x)$.

More precisely, we will derive a probabilistic polynomial algorithm $\tilde{A}(i, y)$
from $A$ that on inputs $i \in I_k$ and $y := f_i(x)$ computes $B_i(x)$ with probability
$> 1/2 + 1/(P(k)Q(k))$, for the infinitely many $k \in K$. This contradiction to the
hard-core property of $B$ will finish the proof of the theorem.

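For $k \in K$, we have

\[
\begin{aligned}
\frac{1}{P(k)} &< \mathrm{prob}(A(i, z, y) = 1 : i \leftarrow K(1^k),\ (z, y) \stackrel{p_{i,Q(k)}}{\leftarrow} Z_i)
- \mathrm{prob}(A(i, z, y) = 1 : i \leftarrow K(1^k),\ (z, y) \stackrel{p_{i,0}}{\leftarrow} Z_i) \\
&= \sum_{r=0}^{Q(k)-1} \Bigl( \mathrm{prob}(A(i, z, y) = 1 : i \leftarrow K(1^k),\ (z, y) \stackrel{p_{i,r+1}}{\leftarrow} Z_i)
- \mathrm{prob}(A(i, z, y) = 1 : i \leftarrow K(1^k),\ (z, y) \stackrel{p_{i,r}}{\leftarrow} Z_i) \Bigr).
\end{aligned}
\]

Randomly choosing $r$, we expect that the $r$-th term in the sum is
$> 1/(P(k)Q(k))$.
On inputs $i \in I_k$, $y \in D_i$, the algorithm $\tilde{A}$ works as follows: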


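1. Choose $r$, with $0 \le r < Q(k)$, uniformly at random.

2. Independently choose random bits $b_1, b_2, \ldots, b_{Q(k)-r-1}$ and another random bit $b$.

3. For $y = f_i(x) \in D_i$, let

\[
\tilde{A}(i, y) = \tilde{A}(i, f_i(x)) :=
\begin{cases}
b & \text{if } A(i, b_1, \ldots, b_{Q(k)-r-1}, b, B_i(f_i(x)), \ldots, B_i(f_i^r(x)), f_i^{r+1}(x)) = 1, \\
1 - b & \text{otherwise.}
\end{cases}
\]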


If $A$ distinguishes between $p_{i,r}$ and $p_{i,r+1}$, it yields 1 with higher probability
if the $(Q(k) - r)$-th bit of its input is $B_i(x)$ and not a random bit. Therefore,
we guess in our algorithm that the randomly chosen $b$ is the desired hard-core
bit if $A$ outputs 1.

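We now check that $\tilde{A}$ indeed computes the hard-core bit with a non-negligible
probability. Let $R$ be the random variable describing the choice
of $r$ in the first step of the algorithm. Since $r$ is selected with respect to
the uniform distribution, we have $\mathrm{prob}(R = r) = 1/Q(k)$ for all $r$. Applying
Lemma B.13, we get

\[
\begin{aligned}
&\mathrm{prob}(\tilde{A}(i, f_i(x)) = B_i(x) : i \leftarrow K(1^k),\ x \stackrel{u}{\leftarrow} D_i) \\
&\quad= \frac{1}{2} + \mathrm{prob}(\tilde{A}(i, f_i(x)) = b \mid B_i(x) = b) - \mathrm{prob}(\tilde{A}(i, f_i(x)) = b) \\
&\quad= \frac{1}{2} + \sum_{r=0}^{Q(k)-1} \mathrm{prob}(R = r) \cdot \Bigl( \mathrm{prob}(\tilde{A}(i, f_i(x)) = b \mid B_i(x) = b, R = r)
- \mathrm{prob}(\tilde{A}(i, f_i(x)) = b \mid R = r) \Bigr) \\
&\quad= \frac{1}{2} + \frac{1}{Q(k)} \sum_{r=0}^{Q(k)-1} \Bigl( \mathrm{prob}(\tilde{A}(i, f_i(x)) = b \mid B_i(x) = b : i \leftarrow K(1^k),\ x \stackrel{u}{\leftarrow} D_i)
- \mathrm{prob}(\tilde{A}(i, f_i(x)) = b : i \leftarrow K(1^k),\ x \stackrel{u}{\leftarrow} D_i) \Bigr) \\
&\quad= \frac{1}{2} + \frac{1}{Q(k)} \sum_{r=0}^{Q(k)-1} \Bigl( \mathrm{prob}(A(i, z, y) = 1 : i \leftarrow K(1^k),\ (z, y) \stackrel{p_{i,r+1}}{\leftarrow} Z_i)
- \mathrm{prob}(A(i, z, y) = 1 : i \leftarrow K(1^k),\ (z, y) \stackrel{p_{i,r}}{\leftarrow} Z_i) \Bigr) \\
&\quad> \frac{1}{2} + \frac{1}{P(k)Q(k)},
\end{aligned}
\]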


for the infinitely many $k \in K$. The probabilities in lines 2 and 3 are computed
with respect to $i \leftarrow K(1^k)$ and $x \stackrel{u}{\leftarrow} D_i$ (and the random choice of the
elements $b_j$, $b$, $r$). Since $r$ is chosen independently, we can omit the conditions
$R = r$. Taking the probability $\mathrm{prob}(\tilde{A}(i, f_i(x)) = b \mid B_i(x) = b)$ conditional
on $B_i(x) = b$ just means that the inputs to $A$ in step 3 of the algorithm $\tilde{A}$
are distributed according to $p_{i,r+1}$. Finally, recall equation (8.1) for $p_{i,r}$ from
above.

Since B is a hard-core predicate, our computation yields the desired con- 
tradiction, and the proof of Theorem 8.4 is complete. 
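
The reduction used in this proof can be traced on a toy instance. The following Python sketch is only an illustration under loud assumptions: the permutation `f`, the predicate `B`, the parameters `N` and `Q`, and the `distinguisher` are hypothetical stand-ins chosen so that everything can be enumerated. In particular, `f` is not one-way (the distinguisher inverts it), so the sketch shows the mechanics of the algorithm derived from a distinguisher, not a security statement.

```python
from fractions import Fraction
from itertools import product

N, Q = 16, 3  # toy domain D = {0, ..., 15} and number of output bits Q


def f(x):
    """Toy permutation of D (an assumption for illustration; NOT one-way)."""
    return (5 * x + 3) % N


def f_inv(y):
    """Inverse of f; available only because f is a toy (13 = 5^-1 mod 16)."""
    return (13 * (y - 3)) % N


def B(x):
    """Toy predicate: the top bit of x (NOT a real hard-core bit)."""
    return (x >> 3) & 1


def G(x):
    """G(x) = (B(x), B(f(x)), ..., B(f^(Q-1)(x)))."""
    bits = []
    for _ in range(Q):
        bits.append(B(x))
        x = f(x)
    return tuple(bits)


def distinguisher(z, y):
    """A(z, y): output 1 iff z equals G(x) for the x with f^Q(x) = y."""
    x = y
    for _ in range(Q):
        x = f_inv(x)
    return 1 if tuple(z) == G(x) else 0


def predictor(y, r, prefix, b):
    """Guess B(x) from y = f(x); r, prefix and b play the role of the coin tosses."""
    tail, t = [], y
    for _ in range(r):          # collects B(f(x)), ..., B(f^r(x))
        tail.append(B(t))
        t = f(t)                # after the loop, t = f^(r+1)(x)
    z = tuple(prefix) + (b,) + tuple(tail)
    return b if distinguisher(z, t) == 1 else 1 - b


def success_probability():
    """Exact success probability over uniform x, r and all coin tosses."""
    p = Fraction(0)
    for x in range(N):
        for r in range(Q):
            coins = list(product((0, 1), repeat=Q - r))  # prefix bits, then b
            for *prefix, b in coins:
                if predictor(f(x), r, prefix, b) == B(x):
                    p += Fraction(1, N * Q * len(coins))
    return p
```

For this toy instance the distinguisher outputs 1 with probability 1 on $p_{i,Q(k)}$ and with probability 1/8 on $p_{i,0}$, so the computation in the proof predicts a success probability of 1/2 + (1/3)(1 - 1/8) = 19/24, which is what `success_probability()` returns.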


Corollary 8.5 (Theorem of Blum and Micali). Pseudorandom bit generators 
induced by one-way permutations with hard-core predicates are computation- 
ally perfect. 


Proof. Let $A(i, z)$ be a probabilistic polynomial algorithm with inputs $i \in I_k$,
$z \in \{0,1\}^{Q(k)}$ and output in $\{0,1\}$. We define $\tilde{A}(i, z, y) := A(i, z)$, and
observe that

\[
\mathrm{prob}(\tilde{A}(i, z, y) = 1 : i \leftarrow I_k,\ z \stackrel{u}{\leftarrow} \{0,1\}^{Q(k)},\ y \stackrel{u}{\leftarrow} D_i)
= \mathrm{prob}(A(i, z) = 1 : i \leftarrow I_k,\ z \stackrel{u}{\leftarrow} \{0,1\}^{Q(k)}),
\]

and the corollary follows from Theorem 8.4 applied to $\tilde{A}$.


8.2 Yao’s Theorem


Computationally perfect pseudorandom bit generators such as the ones in- 
duced by one-way permutations are characterized by another unique feature: 
it is not possible to predict the next bit in the pseudorandom sequence from 
the preceding bits. 


Definition 8.6. Let $I = (I_k)_{k \in \mathbb{N}}$ be a key set with security parameter $k$, and
let $G = (G_i : X_i \to \{0,1\}^{l(k)})_{i \in I}$ be a pseudorandom bit generator with
polynomial stretch function $l$ and key generator $K$:

1. A next-bit predictor for $G$ is a probabilistic polynomial algorithm
$A(i, z_1 \ldots z_r)$ which, given $i \in I_k$, outputs a bit (“the next bit”) from $r$
input bits $z_j$ ($0 \le r < l(k)$).

2. $G$ passes all next-bit tests if and only if for every next-bit predictor $A$
and every positive polynomial $P \in \mathbb{Z}[X]$, there is a $k_0 \in \mathbb{N}$ such that for
all $k \ge k_0$ and all $0 \le r < l(k)$

\[
\mathrm{prob}(A(i, G_{i,1}(x) \ldots G_{i,r}(x)) = G_{i,r+1}(x) : i \leftarrow K(1^k),\ x \stackrel{u}{\leftarrow} X_i) \le \frac{1}{2} + \frac{1}{P(k)}.
\]

Here and in what follows we denote by $G_{i,j}$, $1 \le j \le l(k)$, the $j$-th bit
generated by $G_i$:

\[
G_i(x) = (G_{i,1}(x), G_{i,2}(x), \ldots, G_{i,l(k)}(x)).
\]

Remarks:

1. A next-bit predictor has two inputs: the key $i$ and a bit string $z_1 \ldots z_r$
of variable length.

2. As usual, the probability in the definition is also taken over the random
choice of a key $i$ with security parameter $k$. This means that when randomly
generating a key $i$, the probability of obtaining one for which $A$
has a significant chance of predicting a next bit is negligibly small (see
Proposition 6.17).
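
To make the definition concrete, the following Python sketch computes, for a deliberately weak toy generator, the exact probability with which a simple predictor guesses the next bit. The generator `toy_generator` (seed followed by its parity), the predictor `parity_predictor` and the seed length `N_SEED` are hypothetical choices for illustration; keys are omitted because the toy generator has none.

```python
from fractions import Fraction
from itertools import product

N_SEED = 4  # seed length n; the toy generator outputs n + 1 bits


def toy_generator(seed):
    """Append the parity of the seed: a deliberately weak generator."""
    return tuple(seed) + (sum(seed) % 2,)


def parity_predictor(prefix):
    """Next-bit predictor: guess the parity of the bits seen so far."""
    return sum(prefix) % 2


def prediction_probability(predictor, r):
    """Exact probability that the predictor guesses bit r+1 from bits 1..r."""
    seeds = list(product((0, 1), repeat=N_SEED))
    hits = sum(
        predictor(toy_generator(s)[:r]) == toy_generator(s)[r] for s in seeds
    )
    return Fraction(hits, len(seeds))
```

For $r < n$ the next bit is an independent seed bit and the predictor succeeds with probability exactly 1/2. For $r = n$ it succeeds with probability 1, so this toy generator does not pass all next-bit tests.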


Theorem 8.7 (Yao’s Theorem). Let $I = (I_k)_{k \in \mathbb{N}}$ be a key set with security
parameter $k$, and let $G = (G_i : X_i \to \{0,1\}^{l(k)})_{i \in I}$ be a pseudorandom bit
generator with polynomial stretch function $l$ and key generator $K$.

Then $G$ is computationally perfect if and only if $G$ passes all next-bit tests.

Proof. Assume that $G$ is computationally perfect and does not pass all
next-bit tests. Then there is a next-bit predictor $A$ and a positive polynomial
$P$, such that for $k$ in an infinite subset $K$ of $\mathbb{N}$, we have a position
$r_k$, $0 \le r_k < l(k)$, with $q_{k,r_k} > 1/2 + 1/P(k)$, where

\[
q_{k,r} := \mathrm{prob}(A(i, G_{i,1}(x) \ldots G_{i,r}(x)) = G_{i,r+1}(x) : i \leftarrow K(1^k),\ x \stackrel{u}{\leftarrow} X_i).
\]

By Proposition 6.18, we can compute the probabilities $q_{k,r}$, $r = 0, 1, 2, \ldots$,
approximately with high probability, and we conclude that there is a probabilistic
polynomial algorithm $R$ which on input $1^k$ finds a position where the
next-bit predictor is successful:

\[
\mathrm{prob}\Bigl( q_{k,R(1^k)} > \frac{1}{2} + \frac{1}{2P(k)} \Bigr) \ge 1 - \frac{1}{4P(k)}.
\]


We define a probabilistic polynomial algorithm $\tilde{A}$ (a statistical test) for
inputs $i \in I_k$ and $z = (z_1, \ldots, z_{l(k)}) \in \{0,1\}^{l(k)}$ as follows: Let $r := R(1^k)$ and
set

\[
\tilde{A}(i, z) :=
\begin{cases}
1 & \text{if } z_{r+1} = A(i, z_1 \ldots z_r), \\
0 & \text{otherwise.}
\end{cases}
\]

For truly random sequences, it is not possible to predict a next bit with
probability $> 1/2$. Thus, we have for the uniform distribution on $\{0,1\}^{l(k)}$
that

\[
\mathrm{prob}(\tilde{A}(i, z) = 1 : i \leftarrow K(1^k),\ z \stackrel{u}{\leftarrow} \{0,1\}^{l(k)}) \le \frac{1}{2}.
\]
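We obtain, for the infinitely many $k \in K$, that

\[
\begin{aligned}
&\bigl|\, \mathrm{prob}(\tilde{A}(i, G_i(x)) = 1 : i \leftarrow K(1^k),\ x \stackrel{u}{\leftarrow} X_i)
- \mathrm{prob}(\tilde{A}(i, z) = 1 : i \leftarrow K(1^k),\ z \stackrel{u}{\leftarrow} \{0,1\}^{l(k)}) \,\bigr| \\
&\quad> \Bigl(1 - \frac{1}{4P(k)}\Bigr) \cdot \Bigl(\frac{1}{2} + \frac{1}{2P(k)}\Bigr) - \frac{1}{2}
= \frac{3}{8P(k)} - \frac{1}{8P^2(k)} \ge \frac{1}{4P(k)},
\end{aligned}
\]

which is a contradiction to the assumption that $G$ is computationally perfect.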

Conversely, assume that the sequences generated by $G$ pass all next-bit
tests, but can be distinguished from truly random sequences by a statistical
test $A$. This means that

\[
\bigl|\, \mathrm{prob}(A(i, G_i(x)) = 1 : i \leftarrow K(1^k),\ x \stackrel{u}{\leftarrow} X_i)
- \mathrm{prob}(A(i, z) = 1 : i \leftarrow K(1^k),\ z \stackrel{u}{\leftarrow} \{0,1\}^{l(k)}) \,\bigr| > \frac{1}{P(k)},
\]

for some positive polynomial $P$ and $k$ in an infinite subset $K$ of $\mathbb{N}$. Replacing
$A$ by $1 - A$, if necessary, we may drop the absolute value.

The proof now runs in a similar way to the proof of Theorem 8.4. For
$k \in K$ and $i \in I_k$, we consider a sequence $p_{i,0}, p_{i,1}, \ldots, p_{i,l(k)}$ of distributions
on $\{0,1\}^{l(k)}$:


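\[
\begin{aligned}
p_{i,0} &:= \{(b_1, \ldots, b_{l(k)}) : (b_1, \ldots, b_{l(k)}) \stackrel{u}{\leftarrow} \{0,1\}^{l(k)}\} \\
p_{i,1} &:= \{(G_{i,1}(x), b_2, \ldots, b_{l(k)}) : (b_2, \ldots, b_{l(k)}) \stackrel{u}{\leftarrow} \{0,1\}^{l(k)-1},\ x \stackrel{u}{\leftarrow} X_i\} \\
p_{i,2} &:= \{(G_{i,1}(x), G_{i,2}(x), b_3, \ldots, b_{l(k)}) : (b_3, \ldots, b_{l(k)}) \stackrel{u}{\leftarrow} \{0,1\}^{l(k)-2},\ x \stackrel{u}{\leftarrow} X_i\} \\
&\ \ \vdots \\
p_{i,r} &:= \{(G_{i,1}(x), G_{i,2}(x), \ldots, G_{i,r}(x), b_{r+1}, \ldots, b_{l(k)}) : (b_{r+1}, \ldots, b_{l(k)}) \stackrel{u}{\leftarrow} \{0,1\}^{l(k)-r},\ x \stackrel{u}{\leftarrow} X_i\} \\
&\ \ \vdots \\
p_{i,l(k)} &:= \{(G_{i,1}(x), G_{i,2}(x), \ldots, G_{i,l(k)}(x)) : x \stackrel{u}{\leftarrow} X_i\}.
\end{aligned}
\]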


We start with truly random bit sequences, and in each step we replace one
more truly random bit from the left by a pseudorandom bit. Finally, in $p_{i,l(k)}$
we have the distribution of the pseudorandom sequences.

Our assumption says that for $k \in K$, algorithm $A$ is able to distinguish
between the distribution $p_{i,l(k)}$ (of pseudorandom sequences) and the (uniform)
distribution $p_{i,0}$. Again, the basic idea of the proof now is that $A$ must
be able to distinguish between two subsequent distributions $p_{i,r}$ and $p_{i,r+1}$,
for some $r$. However, $p_{i,r+1}$ differs from $p_{i,r}$ in one position only, and there a
truly random bit is replaced by the next bit $G_{i,r+1}(x)$ of the pseudorandom
sequence. Therefore, algorithm $A$ can also be used to predict $G_{i,r+1}(x)$.

More precisely, we will derive a probabilistic polynomial algorithm
$\tilde{A}(i, z_1 \ldots z_r)$ that successfully predicts the next bit $G_{i,r+1}(x)$ from
$G_{i,1}(x), G_{i,2}(x), \ldots, G_{i,r}(x)$ for some $r = r_k$, for the infinitely many $k \in K$.
This contradiction to the assumption that $G$ passes all next-bit tests will
finish the proof of the theorem.

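Since $A$ is able to distinguish between the uniform distribution and the
distribution induced by $G$, we get for $k \in K$ that

\[
\begin{aligned}
\frac{1}{P(k)} &< \mathrm{prob}(A(i, z) = 1 : i \leftarrow K(1^k),\ z \stackrel{p_{i,l(k)}}{\leftarrow} \{0,1\}^{l(k)})
- \mathrm{prob}(A(i, z) = 1 : i \leftarrow K(1^k),\ z \stackrel{p_{i,0}}{\leftarrow} \{0,1\}^{l(k)}) \\
&= \sum_{r=0}^{l(k)-1} \Bigl( \mathrm{prob}(A(i, z) = 1 : i \leftarrow K(1^k),\ z \stackrel{p_{i,r+1}}{\leftarrow} \{0,1\}^{l(k)})
- \mathrm{prob}(A(i, z) = 1 : i \leftarrow K(1^k),\ z \stackrel{p_{i,r}}{\leftarrow} \{0,1\}^{l(k)}) \Bigr).
\end{aligned}
\]

We conclude that for $k \in K$, there is some $r_k$, $0 \le r_k < l(k)$, with

\[
\frac{1}{P(k)l(k)} < \mathrm{prob}(A(i, z) = 1 : i \leftarrow K(1^k),\ z \stackrel{p_{i,r_k+1}}{\leftarrow} \{0,1\}^{l(k)})
- \mathrm{prob}(A(i, z) = 1 : i \leftarrow K(1^k),\ z \stackrel{p_{i,r_k}}{\leftarrow} \{0,1\}^{l(k)}).
\]

This means that $A(i, z)$ yields 1 for

\[
z = (G_{i,1}(x), G_{i,2}(x), \ldots, G_{i,r_k}(x), b, b_{r_k+2}, \ldots, b_{l(k)})
\]

with higher probability if $b$ is equal to $G_{i,r_k+1}(x)$ and not a truly random bit.
On inputs $i \in I_k$, $z_1 \ldots z_r$ ($0 \le r < l(k)$), algorithm $\tilde{A}$ is defined as follows: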


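1. Choose truly random bits $b, b_{r+2}, \ldots, b_{l(k)}$, and set
$z := (z_1, \ldots, z_r, b, b_{r+2}, \ldots, b_{l(k)})$.

2. Let

\[
\tilde{A}(i, z_1 \ldots z_r) :=
\begin{cases}
b & \text{if } A(i, z) = 1, \\
1 - b & \text{if } A(i, z) = 0.
\end{cases}
\]

Applying Lemma B.13, we get

\[
\begin{aligned}
&\mathrm{prob}(\tilde{A}(i, G_{i,1}(x) \ldots G_{i,r_k}(x)) = G_{i,r_k+1}(x) : i \leftarrow K(1^k),\ x \stackrel{u}{\leftarrow} X_i) \\
&\quad= \frac{1}{2} + \mathrm{prob}(\tilde{A}(i, G_{i,1}(x) \ldots G_{i,r_k}(x)) = b \mid G_{i,r_k+1}(x) = b : i \leftarrow K(1^k),\ x \stackrel{u}{\leftarrow} X_i) \\
&\qquad- \mathrm{prob}(\tilde{A}(i, G_{i,1}(x) \ldots G_{i,r_k}(x)) = b : i \leftarrow K(1^k),\ x \stackrel{u}{\leftarrow} X_i) \\
&\quad= \frac{1}{2} + \mathrm{prob}(A(i, z) = 1 : i \leftarrow K(1^k),\ z \stackrel{p_{i,r_k+1}}{\leftarrow} \{0,1\}^{l(k)})
- \mathrm{prob}(A(i, z) = 1 : i \leftarrow K(1^k),\ z \stackrel{p_{i,r_k}}{\leftarrow} \{0,1\}^{l(k)}) \\
&\quad> \frac{1}{2} + \frac{1}{P(k)l(k)},
\end{aligned}
\]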


for the infinitely many k € K. This is the desired contradiction and completes 
the proof of Yao’s Theorem. 
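
The second half of the proof turns a statistical test into a next-bit predictor. The Python sketch below replays that construction on a toy example. The generator `toy_generator` (whose third output bit is the XOR of the first two) and the test `distinguisher` are hypothetical and chosen only so that every probability can be computed by enumeration; keys are omitted.

```python
from fractions import Fraction
from itertools import product

L_OUT = 4  # output length l of the toy generator


def toy_generator(x):
    """Hypothetical weak generator: the third bit is the XOR of the first two."""
    x1, x2, x3 = x
    return (x1, x2, x1 ^ x2, x3)


def distinguisher(z):
    """Statistical test A: output 1 iff z shows the XOR relation."""
    return 1 if z[2] == z[0] ^ z[1] else 0


def predictor(prefix, b, suffix):
    """Predict bit r+1 from prefix = z_1..z_r; b and suffix are the coin tosses."""
    z = tuple(prefix) + (b,) + tuple(suffix)
    return b if distinguisher(z) == 1 else 1 - b


def success_probability(r):
    """Exact probability that the predictor guesses bit r+1 of the generator."""
    seeds = list(product((0, 1), repeat=3))
    coins = list(product((0, 1), repeat=L_OUT - r))  # b, then the suffix bits
    hits = 0
    for x in seeds:
        g = toy_generator(x)
        for b, *suffix in coins:
            hits += predictor(g[:r], b, suffix) == g[r]
    return Fraction(hits, len(seeds) * len(coins))
```

In this example the test only separates $p_{i,2}$ from $p_{i,3}$, and accordingly the derived predictor is successful only at position $r = 2$ (with probability 1); at the other positions it does no better than guessing.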


Exercises 


1. Let $I = (I_k)_{k \in \mathbb{N}}$ be a key set with security parameter $k$, and let
$G = (G_i)_{i \in I}$ be a computationally perfect pseudorandom bit generator
with polynomial stretch function $l$. Let $\pi = (\pi_i)_{i \in I}$ be a family of permutations,
where $\pi_i$ is a permutation of $\{0,1\}^{l(k)}$ for $i \in I_k$. Assume that $\pi$
can be computed by a polynomial algorithm $\Pi$, i.e., $\pi_i(y) = \Pi(i, y)$. Let
$\pi \circ G$ be the composition of $\pi$ and $G$: $x \mapsto \pi_i(G_i(x))$.

Show that $\pi \circ G$ is also a computationally perfect pseudorandom bit
generator.

2. Give an example of a computationally perfect pseudorandom bit generator
$G = (G_i)_{i \in I}$ and a family of permutations $\pi$, such that $\pi \circ G$ is not
computationally perfect.

(According to Exercise 1, $\pi$ cannot be computable in polynomial time.)


3. Let $G = (G_i)_{i \in I}$ be a pseudorandom bit generator with polynomial
stretch function $l$ and key generator $K$.
Show that $G$ is computationally perfect if and only if next bits in the
past cannot be predicted; i.e., for every probabilistic polynomial algorithm
$A(i, z_{r+1} \ldots z_{l(k)})$ which, given $i \in I_k$, outputs a bit (“the next bit
in the past”) from $l(k) - r$ input bits $z_j$, and every positive polynomial $P$,
there is a $k_0 \in \mathbb{N}$ such that for all $k \ge k_0$ and all $1 \le r \le l(k)$

\[
\mathrm{prob}(G_{i,r}(x) = A(i, G_{i,r+1}(x) \ldots G_{i,l(k)}(x)) : i \leftarrow K(1^k),\ x \stackrel{u}{\leftarrow} X_i) \le \frac{1}{2} + \frac{1}{P(k)}.
\]


4. Let $Q$ be a positive polynomial, and let $G = (G_i)_{i \in I}$ be a computationally
perfect pseudorandom bit generator with

\[
G_i : \{0,1\}^{Q(k)} \to \{0,1\}^{Q(k)+1} \quad (i \in I_k),
\]

i.e., $G$ extends the binary length of the seeds by 1. Recursively, define
the pseudorandom bit generators $G^l$ by

\[
G^1 := G, \qquad G_i^l(x) := (G_{i,1}(x), G_i^{l-1}(G_{i,2}(x), \ldots, G_{i,Q(k)+1}(x))).
\]

As before, we denote by $G_{i,j}(x)$ the $j$-th bit of $G_i(x)$. Let $l$ vary with
the security parameter $k$, i.e., $l = l(k)$, and assume that $l : \mathbb{N} \to \mathbb{N}$ is a
polynomial function.

Show that $G^l$ is computationally perfect.


5. Prove the following stronger version of Yao’s Theorem (Theorem 8.7).

Let $I = (I_k)_{k \in \mathbb{N}}$ be a key set with security parameter $k$, and let
$G = (G_i : X_i \to \{0,1\}^{l(k)})_{i \in I}$ be a pseudorandom bit generator with
polynomial stretch function $l$ and key generator $K$.
Let $f = (f_i : X_i \to Y_i)_{i \in I}$ be a Monte-Carlo computable family of maps.
Then, the following statements are equivalent:

a. For every probabilistic polynomial algorithm $A(i, y, z)$ and every positive
polynomial $P$, there is a $k_0 \in \mathbb{N}$ such that for all $k \ge k_0$ and all
$0 \le r < l(k)$

\[
\mathrm{prob}(G_{i,r+1}(x) = A(i, f_i(x), G_{i,1}(x) \ldots G_{i,r}(x)) : i \leftarrow K(1^k),\ x \stackrel{u}{\leftarrow} X_i) \le \frac{1}{2} + \frac{1}{P(k)}.
\]

b. For every probabilistic polynomial algorithm $A(i, y, z)$ and every positive
polynomial $P$, there is a $k_0 \in \mathbb{N}$ such that for all $k \ge k_0$

\[
\bigl|\, \mathrm{prob}(A(i, f_i(x), z) = 1 : i \leftarrow K(1^k),\ x \stackrel{u}{\leftarrow} X_i,\ z \stackrel{u}{\leftarrow} \{0,1\}^{l(k)})
- \mathrm{prob}(A(i, f_i(x), G_i(x)) = 1 : i \leftarrow K(1^k),\ x \stackrel{u}{\leftarrow} X_i) \,\bigr| \le \frac{1}{P(k)}.
\]

In this exercise a setting is modeled in which some information $f_i(x)$
about the seed $x$ is known to the adversary.


6. Let $f = (f_i : D_i \to R_i)_{i \in I}$ be a family of one-way functions with key
generator $K$, and let $B = (B_i : D_i \to \{0,1\}^{l(k)})_{i \in I}$ be a family of $l$-bit
predicates which is computable by a Monte Carlo algorithm ($l$ a
polynomial function). Let $B_i = (B_{i,1}, B_{i,2}, \ldots, B_{i,l(k)})$. We call $B$ an $l$-bit
hard-core predicate for $f$ (or simultaneously secure bits of $f$), if for every
probabilistic polynomial algorithm $A(i, y, z_1, \ldots, z_l)$ and every positive
polynomial $P$, there is a $k_0 \in \mathbb{N}$ such that for all $k \ge k_0$

\[
\bigl|\, \mathrm{prob}(A(i, f_i(x), B_{i,1}(x), \ldots, B_{i,l(k)}(x)) = 1 : i \leftarrow K(1^k),\ x \stackrel{u}{\leftarrow} D_i)
- \mathrm{prob}(A(i, f_i(x), z) = 1 : i \leftarrow K(1^k),\ x \stackrel{u}{\leftarrow} D_i,\ z \stackrel{u}{\leftarrow} \{0,1\}^{l(k)}) \,\bigr| \le \frac{1}{P(k)}.
\]

For $l = 1$, the definition is equivalent to our previous Definition 6.15 of
hard-core bits (see Exercise 7 in Chapter 6).

Now assume that $B$ is an $l$-bit hard-core predicate, and let
$C = (C_i : \{0,1\}^{l(k)} \to \{0,1\})_{i \in I}$ be a Monte-Carlo computable family of
predicates with $\mathrm{prob}(C_i(x) = 0 : x \stackrel{u}{\leftarrow} \{0,1\}^{l(k)}) = 1/2$ for all $i \in I$.

Show that the composition $C \circ B$, $x \in D_i \mapsto C_i(B_i(x))$, is a hard-core
predicate for $f$.


7. Let $f = (f_i : D_i \to R_i)_{i \in I}$ be a family of one-way functions with key
generator $K$, and let $B = (B_i : D_i \to \{0,1\}^{l(k)})_{i \in I}$ be a family of $l$-bit
predicates for $f$. Let $B_i = (B_{i,1}, B_{i,2}, \ldots, B_{i,l(k)})$. Assume that knowing
$f_i(x)$ and $B_{i,1}(x), \ldots, B_{i,j-1}(x)$ does not help in the computation of
$B_{i,j}(x)$. More precisely, assume that for every probabilistic polynomial
algorithm $A(i, y, z)$ and every positive polynomial $P$, there is a $k_0 \in \mathbb{N}$
such that for all $k \ge k_0$ and all $1 \le j \le l(k)$

\[
\mathrm{prob}(A(i, f_i(x), B_{i,1}(x) \ldots B_{i,j-1}(x)) = B_{i,j}(x) : i \leftarrow K(1^k),\ x \stackrel{u}{\leftarrow} D_i) \le \frac{1}{2} + \frac{1}{P(k)}.
\]

(In particular, the $(B_{i,j})_{i \in I}$ are hard-core predicates for $f$.)
Show that the bits $B_{i,1}, \ldots, B_{i,l}$ are simultaneously secure bits for $f$.

Examples:

The $\lfloor \log_2(|n|) \rfloor$ least-significant bits are simultaneously secure for the
RSA (and the Square) one-way function (Exercise 12 in Chapter 7). If
$p$ is a prime, $p - 1 = 2^t a$ and $a$ is odd, then the bits at the positions
$t, t+1, \ldots, t + \lfloor \log_2(|p|) \rfloor$ (counted from the right, starting with 0) are
simultaneously secure for the discrete exponential one-way function
(Exercise 5 in Chapter 7).


8. Let $f = (f_i : D_i \to D_i)_{i \in I}$ be a family of one-way permutations with
key generator $K$, and let $B = (B_i)_{i \in I}$ be an $l$-bit hard-core predicate
for $f$. Let $G$ be the following pseudorandom bit generator with stretch
function $lQ$ ($Q$ a positive polynomial):

\[
G := (G_i : D_i \to \{0,1\}^{l(k)Q(k)})_{i \in I}, \qquad
x \in D_i \mapsto (B_i(x), B_i(f_i(x)), B_i(f_i^2(x)), \ldots, B_i(f_i^{Q(k)-1}(x))).
\]

Prove a statement that is analogous to Theorem 8.4. In particular, prove
that $G$ is computationally perfect.

Example:

Taking the Square one-way permutation $x \mapsto x^2$ ($x \in \mathrm{QR}_n$, with $n$ a
product of distinct primes) and the $\lfloor \log_2(|n|) \rfloor$ least-significant bits, we
get the generalized Blum-Blum-Shub generator. It is used in the
Blum-Goldwasser probabilistic encryption scheme (see Chapter 9).


This chapter deals with provable security. It is desirable that mathematical
proofs show that a given cryptosystem resists certain types of attacks. The 
security of cryptographic schemes and randomness are closely related. An en- 
cryption method provides secrecy only if the ciphertexts appear sufficiently 
random to the adversary. Therefore, probabilistic encryption algorithms are 
required. The pioneering work of Shannon on provable security, based on his 
information theory, is discussed in Section 9.1. For example, we prove that 
Vernam’s one-time pad is a perfectly secret encryption. Shannon’s notion of 
perfect secrecy may be interpreted in terms of probabilistic attacking algo- 
rithms that try to distinguish between two candidate plaintexts (Section 9.2). 
Unfortunately, Vernam’s one-time pad is not practical in most situations. In 
Section 9.3, we give important examples of probabilistic encryption algo- 
rithms that are practical. One-way permutations with hard-core predicates 
yield computationally perfect pseudorandom bit generators (Chapter 8), and 
these can be used to define “public-key pseudorandom one-time pads”, by 
analogy to Vernam’s one-time pad: the plaintext bits are XORed with pseu- 
dorandom bits generated from a short, truly random (one-time) seed. More 
recent notions of provable security, which include the computational complex- 
ity of attacking algorithms, are considered in Section 9.4. The computational 
analogue of Shannon’s perfect secrecy, ciphertext-indistinguishability, is de- 
fined. A typical security proof for probabilistic public-key encryption schemes 
is given. We show that the public-key one-time pads, introduced in Section 
9.3, provide computationally perfect secrecy against passive eavesdroppers, 
who perform ciphertext-only or chosen-plaintext attacks. Encryption schemes
that are secure against adaptively-chosen-ciphertext attacks are considered
in Section 9.5. The security proof for Boneh’s SAEP is a typical proof in the
random oracle model, whereas the proof for Cramer-Shoup’s public key encryption
scheme is based solely on a standard number-theoretic assumption and the
collision-resistance of the hash function used. Finally, a short introduction to
some results of the “unconditional security approach” is given in Section 9.6. 
In this approach, the goal is to design practical cryptosystems which prov- 
ably come close to perfect information-theoretic security, without relying on 
unproven assumptions about problems from computational number theory.


9.1 Classical Information-Theoretic Security


A deterministic public-key encryption algorithm $E$ necessarily leaks information
to an adversary. For example, recall the small-message-space attack
on RSA (Section 3.3.3). An adversary intercepts a ciphertext $c$ and knows
that the transmitted message $m$ is from a small set $\{m_1, \ldots, m_r\}$ of possible
messages. Then he easily finds out $m$ by computing the ciphertexts
$E(m_1), \ldots, E(m_r)$ and comparing them with $c$. This example shows that
randomness in encryption is necessary to ensure real secrecy. Learning the 
encryption c = E(m) of a message m, an adversary should not be able to 
predict the ciphertext the next time when m is encrypted by E. This obser- 
vation applies also to symmetric-key encryption schemes. Thus, to obtain a 
provably secure encryption scheme, we have to study randomized encryption 
algorithms. 
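
The small-message-space attack is easy to reproduce. The following Python sketch uses textbook RSA with the toy parameters $n = 61 \cdot 53 = 3233$ and $e = 17$; these numbers, the candidate messages and the function names are assumptions chosen for illustration and are far too small for any real use.

```python
# Toy textbook-RSA public key (assumed for illustration): n = 61 * 53, e = 17.
N_MOD, E_PUB = 3233, 17


def encrypt(m):
    """Deterministic encryption: the same m always yields the same ciphertext."""
    return pow(m, E_PUB, N_MOD)


def small_message_space_attack(c, candidates):
    """Encrypt each candidate with the public key and compare with c."""
    for m in candidates:
        if encrypt(m) == c:
            return m
    return None
```

For example, if the adversary knows that the plaintext is one of 65, 66 or 1000 and intercepts `c = encrypt(66)`, then `small_message_space_attack(c, [65, 66, 1000])` recovers 66 without any knowledge of the private key.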


Definition 9.1. An encryption algorithm $E$, which on input $m \in M$ outputs
a ciphertext $c \in C$, is called a randomized encryption if $E$ is a
non-deterministic probabilistic algorithm.


The random behavior of a randomized encryption $E$ is caused by its coin
tosses. These coin tosses may be considered as the random choice of a one- 
time key (for each message to be encrypted a new random key is chosen, 
independently of the previous choices). Take, for example, Vernam’s one-time 
pad which is the classical example of a randomized (and provably secure) 
cipher. We recall its definition. 


Definition 9.2. Let $n \in \mathbb{N}$ and $M := C := \{0,1\}^n$. The randomized encryption
$E$ which encrypts a message $m \in M$ by XORing it bitwise with a
randomly and uniformly chosen bit sequence $k \stackrel{u}{\leftarrow} \{0,1\}^n$ of the same length,
$E(m) := m \oplus k$, is called Vernam’s one-time pad.


As the name indicates, key k is used only once: each time a message 
m is encrypted, a new bit sequence is randomly chosen as the encryption 
key. This choice of the key is viewed as the coin tosses of a probabilistic 
algorithm. The security of a randomized encryption algorithm is related to 
the level of randomness caused by its coin tosses. More randomness means 
more security. Vernam’s one-time pad includes a maximum of randomness 
and hence, provably, provides a maximum of security, as we will see below 
(Theorem 9.5). 
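
A minimal Python sketch of Vernam's one-time pad on byte strings may help to fix the definition. It assumes that the operating system's random source, accessed through the standard `secrets` module, is an acceptable stand-in for truly random key bits; the function names are chosen for illustration.

```python
import secrets


def otp_encrypt(message):
    """Encrypt with a fresh, uniformly chosen key of the same length."""
    key = secrets.token_bytes(len(message))
    cipher = bytes(m ^ k for m, k in zip(message, key))
    return key, cipher


def otp_decrypt(key, cipher):
    """XORing with the same key again recovers the plaintext."""
    return bytes(c ^ k for c, k in zip(cipher, key))
```

Each call to `otp_encrypt` draws a new key, which mirrors the requirement that the key is used only once; the key then has to reach the recipient over a secure channel.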

The problem with Vernam’s one-time pad is that truly random keys of the 
same length as the message have to be generated and securely transmitted 
to the recipient. This is rarely a practical operation (for an example, see 
Section 2.1). Later (in Section 9.4), we will see how to obtain practical, but 
still provably secure probabilistic encryption methods, by using high quality 
pseudorandom bit sequences as keys. 

The classical notion of security of an encryption algorithm is based
on Shannon’s information theory and his famous papers [Shannon48] and
[Shannon49]. Appendix B.4 gives an introduction to information theory and
its basic notions, such as entropy, uncertainty and mutual information.

We consider a randomized encryption algorithm $E$ mapping plaintexts
$m \in M$ to ciphertexts $c \in C$. We assume that the messages to be encrypted
are generated according to some probability distribution, i.e., $M$ is assumed
to be a probability space. The distribution on $M$ and the algorithm $E$ induce
probability distributions on $M \times C$ and $C$ (see Section 5.1). As usual, the
probability space induced on $M \times C$ is denoted by $MC$ and, for $m \in M$
and $c \in C$, $\mathrm{prob}(c|m)$ denotes the probability that $c$ is the ciphertext if
the plaintext is $m$. Analogously, $\mathrm{prob}(m|c)$ is the probability that $m$ is the
plaintext if $c$ is the ciphertext.¹

Without loss of generality, we assume that $\mathrm{prob}(m) > 0$ for all $m \in M$
and that $\mathrm{prob}(c) > 0$ for all $c \in C$.


Definition 9.3 (Shannon). The encryption E is perfectly secret if C and M 
are independent, i.e., the distribution of MC is the product of the distribu- 
tions on M and C: 


\[
\mathrm{prob}(m, c) = \mathrm{prob}(m) \cdot \mathrm{prob}(c), \quad \text{for all } m \in M,\ c \in C.
\]

Perfect secrecy can be characterized in different ways.


Proposition 9.4. The following statements are equivalent:

1. $E$ is perfectly secret.
2. The mutual information $I(M; C) = 0$.
3. $\mathrm{prob}(m|c) = \mathrm{prob}(m)$, for all $m \in M$ and $c \in C$.
4. $\mathrm{prob}(c|m) = \mathrm{prob}(c)$, for all $m \in M$ and $c \in C$.
5. $\mathrm{prob}(c|m) = \mathrm{prob}(c|m')$, for all $m, m' \in M$ and $c \in C$.
6. $\mathrm{prob}(E(m) = c) = \mathrm{prob}(c)$, for all $m \in M$ and $c \in C$.
7. $\mathrm{prob}(E(m) = c) = \mathrm{prob}(E(m') = c)$, for all $m, m' \in M$ and $c \in C$;
i.e., the distribution of $E(m)$ does not depend on $m$.


Proof. All statements of Proposition 9.4 are contained in, or immediately 
follow from Proposition B.32. For the latter two statements, observe that 


prob(c|m) = prob(E(m) = c), 


by the definition of prob(E(m) = c) (see Chapter 5). 
Remarks: 


1. The probabilities in statement 7 only depend on the coin tosses of $E$. This
means, in particular, that the perfect secrecy of an encryption algorithm
$E$ does not depend on the distribution of the plaintexts.

¹ The notation is introduced on p. 328 in Appendix B.1.

2. Let Eve be an attacker trying to discover information about the plaintexts
from the ciphertexts that she is able to intercept. Assume that Eve
is well informed and knows the distribution of the plaintexts. Then per- 
fect secrecy means that her uncertainty about the plaintext (as precisely 
defined in information theory, see Appendix B.4, Definition B.27) is the 
same whether or not she observes the ciphertext c: learning the ciphertext 
does not increase her information about the plaintext m. Thus, perfect 
secrecy really means unconditional security against ciphertext-only at- 
tacks. 

3. A perfectly secret randomized encryption $E$ also withstands the other
types of attacks, such as the known-plaintext attacks and
adaptively-chosen-plaintext/ciphertext attacks discussed in Section 1.3. Namely, the
security is guaranteed by the randomness caused by the coin tosses of $E$.
Encrypting, say, $r$ messages, means applying $E$ $r$ times. The coin tosses
within one of these executions of E are independent of the coin tosses in 
the other executions (in Vernam’s one-time pad, this corresponds to the 
fact that an individual key is chosen independently for each message). 
Knowing details about previous encryptions does not help the adver- 
sary. Each encryption is a new and independent random experiment and, 
hence, the probabilities prob(c|m) are the same, whether we take them 
conditional on other plaintext-ciphertext pairs (m’,c’) or not. Note that 
additional knowledge of the adversary is included by conditioning the 
probabilities on this knowledge. 

4. The mutual information is a typical measure defined in information theory
(see Definition B.30). It measures the average amount of information
Eve obtains about the plaintext m when learning the ciphertext c. 


Vernam’s one-time pad is a perfectly secret encryption. More generally,
we prove the following theorem.


Theorem 9.5 (Shannon). Let $M := C := K := \{0,1\}^n$, and let $E$ be a
one-time pad, which encrypts $m := (m_1, \ldots, m_n) \in M$ by XORing it with a
random key string $k := (k_1, \ldots, k_n) \in K$, chosen independently from $m$:

\[
E(m) := m \oplus k := (m_1 \oplus k_1, \ldots, m_n \oplus k_n).
\]

Then $E$ is perfectly secret if and only if $K$ is uniformly distributed.


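Proof. We have

\[
\mathrm{prob}_{MC}(m, c) = \mathrm{prob}_{MK}(m, m \oplus c) = \mathrm{prob}_M(m) \cdot \mathrm{prob}_K(m \oplus c).
\]

If $M$ and $C$ are independent, then

\[
\mathrm{prob}_M(m) \cdot \mathrm{prob}_C(c) = \mathrm{prob}_{MC}(m, c) = \mathrm{prob}_M(m) \cdot \mathrm{prob}_K(m \oplus c).
\]

Hence

\[
\mathrm{prob}_K(m \oplus c) = \mathrm{prob}_C(c), \quad \text{for all } m \in M.
\]

This means that $\mathrm{prob}_K(k)$ is the same for all $k \in K$. Thus, $K$ is uniformly
distributed. Conversely, if $K$ is uniformly distributed, then

\[
\mathrm{prob}_C(c) = \sum_{m \in M} \mathrm{prob}_{MK}(m, m \oplus c)
= \sum_{m \in M} \mathrm{prob}_M(m) \cdot \mathrm{prob}_K(m \oplus c)
= \sum_{m \in M} \mathrm{prob}_M(m) \cdot \frac{1}{2^n}
= \frac{1}{2^n}.
\]

Hence, $C$ is also distributed uniformly, and we obtain:

\[
\mathrm{prob}_{MC}(m, c) = \mathrm{prob}_{MK}(m, m \oplus c)
= \mathrm{prob}_M(m) \cdot \mathrm{prob}_K(m \oplus c)
= \mathrm{prob}_M(m) \cdot \frac{1}{2^n}
= \mathrm{prob}_M(m) \cdot \mathrm{prob}_C(c).
\]

Thus, $M$ and $C$ are independent.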
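
For small $n$, Theorem 9.5 can be checked by direct enumeration. The Python sketch below represents $n$-bit strings as integers and uses exact fractions; the message and key distributions are hypothetical examples chosen for illustration, and the function names are not taken from the text.

```python
from fractions import Fraction


def joint_distribution(prob_m, prob_k):
    """prob_MC(m, c) = prob_M(m) * prob_K(m XOR c), as in the proof."""
    return {(m, c): prob_m[m] * prob_k[m ^ c] for m in prob_m for c in prob_k}


def is_perfectly_secret(prob_m, prob_k):
    """Check prob(m, c) = prob(m) * prob(c) for all pairs (Definition 9.3)."""
    joint = joint_distribution(prob_m, prob_k)
    prob_c = {c: sum(joint[m, c] for m in prob_m) for c in prob_k}
    return all(joint[m, c] == prob_m[m] * prob_c[c] for (m, c) in joint)


# Example with n = 2: a non-uniform message distribution and two key distributions.
pm = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 8), 3: Fraction(1, 8)}
uniform = {k: Fraction(1, 4) for k in range(4)}
biased = {0: Fraction(1, 2), 1: Fraction(1, 6), 2: Fraction(1, 6), 3: Fraction(1, 6)}
```

With these inputs, `is_perfectly_secret(pm, uniform)` holds although the messages are far from uniform, whereas `is_perfectly_secret(pm, biased)` fails, in line with the theorem.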
Remarks: 


1. Note that we do not consider the one-time pad as a cipher for plaintexts 
of varying length: we have to assume that all plaintexts have the same 
length n. Otherwise some information, namely the length of the plaintext, 
leaks to adversary Eve, and the encryption could not be perfectly secret. 

2. There is a high price to pay for the perfect secrecy of Vernam’s one-time 
pad. For each message to be encrypted, of length n, n independent ran- 
dom bits have to be chosen for the key. One might hope to find a more 
sophisticated, perfectly secret encryption method requiring less random- 
ness. Unfortunately, this hope is destroyed by the following result which 
was proven by Shannon ([Shannon49]). 


Theorem 9.6. Let $E$ be a randomized encryption algorithm with the deterministic
extension $E_D : M \times K \to C$. Each time a message $m \in M$ is
encrypted, a one-time key $k$ is chosen randomly from $K$ (according to some
probability distribution on $K$), independently from the choice of $m$. Assume
that the plaintext $m$ can be recovered from the ciphertext $c$ and the one-time
key $k$ (no other information is necessary for decryption). Then, if $E$ is perfectly
secret, the uncertainty of the keys cannot be smaller than the uncertainty
of the messages:

\[
H(K) \ge H(M).
\]

Remark. The uncertainty of a probability space $M$ (see Definition B.27) is
maximal and equal to $\log_2(|M|)$ if the distribution of $M$ is uniform
(Proposition B.28). Hence, if $M = \{0,1\}^n$ as in Theorem 9.5, then the entropy of
any key set $K$ yielding a perfectly secret encryption is at least $n$. Thus,
the random choice of $k \in K$ requires the choice of at least $n$ truly random
bits.

Note at this point that the perfect secrecy of an encryption does not de- 
pend on the distribution of the plaintexts (Proposition 9.4). Therefore, we 
may assume that M is uniformly distributed and, as a consequence, that 
H(M) =n. 

Proof. The plaintext m can be recovered from the ciphertext c and the one- 
time key k. This means that there is no uncertainty about the plaintext 
if both the ciphertext and the key are known, i.e., the conditional entropy 
H(M|KC) = 0 (see Definition B.30). Perfect secrecy means I(M;C) = 
0 (Proposition 9.4), or equivalently, H(C) = H(C|M) (Proposition B.32). 
Since $M$ and $K$ are assumed to be independent, $I(K; M) = I(M; K) = 0$
(Proposition B.32). We compute by use of Proposition B.31 and Definition
B.33 the following:

\[
\begin{aligned}
H(K) - H(M) &= I(K; M) + H(K|M) - I(M; K) - H(M|K) \\
&= H(K|M) - H(M|K) \\
&= I(K; C|M) + H(K|CM) - I(M; C|K) - H(M|KC) \\
&= I(K; C|M) + H(K|CM) - I(M; C|K) \\
&\ge I(K; C|M) - I(M; C|K) \\
&= H(C|M) - H(C|KM) - H(C|K) + H(C|KM) \\
&= H(C|M) - H(C|K) \\
&= H(C) - H(C) + I(K; C) = I(K; C) \\
&\ge 0.
\end{aligned}
\]

The proof of Theorem 9.6 is finished. 


Remark. In Vernam’s one-time pad it is not possible, without destroying 
perfect secrecy, to use the same randomly chosen key for the encryption of two 
messages. This immediately follows, for example, from Theorem 9.6. Namely, 
such a modified Vernam one-time pad may be described as a probabilistic 
algorithm from M x M to C x C, with the deterministic extension 


\[
M \times M \times K \to C \times C, \quad (m, m', k) \mapsto (m \oplus k, m' \oplus k),
\]

where $M = K = C = \{0,1\}^n$. Assuming the uniform distribution on $M$, we
have

\[
H(K) = n < H(M \times M) = 2n.
\]
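
The loss of secrecy under key reuse can also be seen directly: XORing the two ciphertexts cancels the key and yields $m \oplus m'$. The following Python sketch illustrates this; the function names are chosen for illustration, and the `secrets` module is assumed as the source of key bytes.

```python
import secrets


def xor_bytes(a, b):
    """Bitwise XOR of two byte strings of equal length."""
    return bytes(x ^ y for x, y in zip(a, b))


def two_time_pad_leak(m1, m2):
    """Encrypt m1 and m2 under the SAME key and return c1 XOR c2."""
    key = secrets.token_bytes(len(m1))
    c1, c2 = xor_bytes(m1, key), xor_bytes(m2, key)
    return xor_bytes(c1, c2)  # the key cancels: this equals m1 XOR m2
```

Whatever key is drawn, `two_time_pad_leak(m1, m2)` equals `xor_bytes(m1, m2)`, so an eavesdropper who sees both ciphertexts learns the XOR of the two plaintexts.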


9.2 Perfect Secrecy and Probabilistic Attacks 


We model the behavior of an adversary Eve by probabilistic algorithms, and
show the relation between the failure of such algorithms and perfect secrecy.
In Section 9.3, we will slightly modify this model by restricting the computing
power of the adversary to polynomial resources.

As in Section 9.1, let $E$ be a randomized encryption algorithm that maps
plaintexts $m \in M$ to ciphertexts $c \in C$ and is used by Alice to encrypt her
messages. As before, Alice chooses the messages $m \in M$ according to some
probability distribution. The distribution on $M$ and the algorithm $E$ induce
probability distributions on $M \times C$ and $C$. Here, $\mathrm{prob}(m, c)$ is the probability that
$m$ is the chosen message and that the probabilistic encryption of $m$ yields $c$.

We first consider a probabilistic algorithm $A$ which on input $c \in C$ outputs
a plaintext $m \in M$. Algorithm $A$ models an adversary Eve performing
a ciphertext-only attack and trying to decrypt ciphertexts. Recall that the
coin tosses of a probabilistic algorithm are independent of any other random
events in the given setting (see Chapter 5). Thus, the coin tosses of $A$ are
independent of the choice of the message and the coin tosses of $E$. This is a
reasonable model, because sender Alice, generating and encrypting messages,
and adversary Eve operate independently. We have


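\[
\mathrm{prob}(m, c, A(c) = m) = \mathrm{prob}(m, c) \cdot \mathrm{prob}(A(c) = m),
\]

for $m \in M$ and $c \in C$ (see Chapter 5). $\mathrm{prob}(A(c) = m)$ is the conditional
probability that $A(c)$ yields $m$, assuming that $m$ and $c$ are fixed. It is determined
by the coin tosses of $A$. The probability of success of $A$ is given
by

\[
\begin{aligned}
\mathrm{prob}_{\mathrm{success}}(A) &:= \sum_{m,c} \mathrm{prob}(m, c) \cdot \mathrm{prob}(A(c) = m) \\
&= \sum_{m,c} \mathrm{prob}(m) \cdot \mathrm{prob}(E(m) = c) \cdot \mathrm{prob}(A(c) = m) \\
&= \mathrm{prob}(A(c) = m : m \leftarrow M,\ c \leftarrow E(m)).
\end{aligned}
\]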


Proposition 9.7. If $E$ is perfectly secret, then for every probabilistic
algorithm $A$ which on input $c \in C$ outputs a plaintext $m \in M$

\[
\mathrm{prob}_{\mathrm{success}}(A) \le \max_{m \in M} \mathrm{prob}(m).
\]
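Proof.

\[
\begin{aligned}
\mathrm{prob}_{\mathrm{success}}(A)
  &= \sum_{m,c} \mathrm{prob}(m, c) \cdot \mathrm{prob}(A(c) = m) \\
  &= \sum_{c} \mathrm{prob}(c) \cdot \sum_{m} \mathrm{prob}(m \,|\, c) \cdot \mathrm{prob}(A(c) = m) \\
  &= \sum_{c} \mathrm{prob}(c) \cdot \sum_{m} \mathrm{prob}(m) \cdot \mathrm{prob}(A(c) = m)
     \quad \text{(by Proposition 9.4)} \\
  &\le \max_{m \in M} \mathrm{prob}(m) \cdot \sum_{c} \mathrm{prob}(c) \cdot \sum_{m} \mathrm{prob}(A(c) = m) \\
  &= \max_{m \in M} \mathrm{prob}(m),
\end{aligned}
\]

and the proposition follows.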
Remarks: 


1. In Proposition 9.7, as in the whole of Section 9.2, we do not assume 
any limits for the resources of the algorithms. The running time and the 
memory requirements may be exponential. 

2. Proposition 9.7 says that for a perfectly secret encryption, selecting a 
plaintext with maximal probability from M, without looking at the ci- 
phertext, is optimum under all attacks that try to derive the plaintext 
from the ciphertext. If M is uniformly distributed, then randomly select- 
ing a plaintext is an optimal strategy. 
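
The bound of Proposition 9.7 can be checked exactly on a very small example.
The following sketch is only an illustration: the 2-bit one-time pad, the
message distribution `PROB_M` and the helper names `joint_prob` and `success`
are assumptions made here and do not appear in the text. It enumerates all
deterministic adversaries and computes the best achievable success
probability.

```python
from fractions import Fraction
from itertools import product

# Toy setting (an assumption for illustration): 2-bit messages, encrypted
# with Vernam's one-time pad under a uniformly random 2-bit key.
MESSAGES = [0, 1, 2, 3]
KEYS = [0, 1, 2, 3]
PROB_M = {0: Fraction(1, 2), 1: Fraction(1, 4),
          2: Fraction(1, 8), 3: Fraction(1, 8)}


def joint_prob(m, c):
    """prob(m, c): m is chosen and its encryption m XOR key yields c."""
    hits = sum(1 for key in KEYS if m ^ key == c)
    return PROB_M[m] * Fraction(hits, len(KEYS))


def success(adversary):
    """prob_success(A) for a deterministic adversary given as a dict c -> m."""
    return sum(joint_prob(m, c)
               for m, c in product(MESSAGES, MESSAGES)
               if adversary[c] == m)


# Enumerate every deterministic adversary. A probabilistic adversary is a
# convex combination of these, so it cannot beat the best of them.
best = max(success(dict(zip(MESSAGES, guess)))
           for guess in product(MESSAGES, repeat=len(MESSAGES)))
```

In this toy setting `best` equals `max(PROB_M.values())`, and it is attained
by the adversary that ignores the ciphertext and always outputs the most
probable message, as stated in Remark 2.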


Perfect secrecy may also be described in terms of distinguishing algo- 
rithms. 


Definition 9.8. A distinguishing algorithm for $E$ is a probabilistic algorithm
$A$ which on inputs $m_0, m_1 \in M$ and $c \in C$ outputs an $m \in \{m_0, m_1\}$.


Remark. A distinguishing algorithm $A$ models an adversary Eve, who, given
a ciphertext $c$ and two plaintext candidates $m_0$ and $m_1$, tries to find
out which of the two is the correct plaintext, i.e., which one is encrypted as
$c$. Again, recall that the coin tosses of $A$ are independent of a random
choice of the messages and the coin tosses of the encryption algorithm (see
Chapter 5). Thus, the adversary Eve and the sender Alice, generating and
encrypting messages, are modeled as working independently.


Proposition 9.9. $E$ is perfectly secret if and only if for every probabilistic
distinguishing algorithm $A$ and all $m_0, m_1 \in M$,

\[
\mathrm{prob}(A(m_0, m_1, c) = m_0 : c \leftarrow E(m_0))
  = \mathrm{prob}(A(m_0, m_1, c) = m_0 : c \leftarrow E(m_1)).
\]


Proof. E is perfectly secret if and only if the distribution of E(m) does not 
depend on m (Proposition 9.4). Thus, the equality obviously holds if E is 
perfectly secret. 

Conversely, assume that $E$ is not perfectly secret. There are no limits for
the running time of our algorithms. Then there is an algorithm $P$ which starts
with a description of the encryption algorithm $E$ and analyzes the paths and
coin tosses of $E$ and, in this way, computes the probabilities
$\mathrm{prob}(c \,|\, m)$:

\[
P(c, m) := \mathrm{prob}(c \,|\, m), \quad \text{for all } c \in C, m \in M.
\]


We define the following distinguishing algorithm:

\[
A(m_0, m_1, c) :=
\begin{cases}
  m_0 & \text{if } P(c, m_0) > P(c, m_1), \\
  m_1 & \text{otherwise.}
\end{cases}
\]


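Since $E$ is not perfectly secret, there are $m_0, m_1 \in M$ and $c_0 \in C$,
such that $P(c_0, m_0) = \mathrm{prob}(c_0 \,|\, m_0) > P(c_0, m_1) =
\mathrm{prob}(c_0 \,|\, m_1)$ (Proposition 9.4). Let

\[
C_0 := \{c \in C \mid \mathrm{prob}(c \,|\, m_0) > \mathrm{prob}(c \,|\, m_1)\}
\quad \text{and} \quad
C_1 := \{c \in C \mid \mathrm{prob}(c \,|\, m_0) \le \mathrm{prob}(c \,|\, m_1)\}.
\]

Then $A(m_0, m_1, c) = m_0$ for $c \in C_0$, and $A(m_0, m_1, c) = m_1$ for
$c \in C_1$. We compute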
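\[
\begin{aligned}
&\mathrm{prob}(A(m_0, m_1, c) = m_0 : c \leftarrow E(m_0))
  - \mathrm{prob}(A(m_0, m_1, c) = m_0 : c \leftarrow E(m_1)) \\
&\quad = \sum_{c \in C} \mathrm{prob}(c \,|\, m_0) \cdot \mathrm{prob}(A(m_0, m_1, c) = m_0)
  - \sum_{c \in C} \mathrm{prob}(c \,|\, m_1) \cdot \mathrm{prob}(A(m_0, m_1, c) = m_0) \\
&\quad = \sum_{c \in C_0} \bigl( \mathrm{prob}(c \,|\, m_0) - \mathrm{prob}(c \,|\, m_1) \bigr) \\
&\quad \ge \mathrm{prob}(c_0 \,|\, m_0) - \mathrm{prob}(c_0 \,|\, m_1) \\
&\quad > 0,
\end{aligned}
\]

and see that a violation of perfect secrecy causes a violation of the equality
condition. The proof of the proposition is finished.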
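
The distinguishing algorithm used in this proof can be written down directly
for a small cipher. The sketch below is an illustration under stated
assumptions: the flawed cipher (a one-bit pad with a biased key, `PROB_KEY`)
and the helper `prob_outputs_m0` are hypothetical and chosen only to make the
probabilities easy to follow. The functions `P` and `A` mirror the objects of
the same name in the proof.

```python
from fractions import Fraction

# Hypothetical, deliberately flawed cipher: c = m XOR key with a biased key bit.
PROB_KEY = {0: Fraction(3, 4), 1: Fraction(1, 4)}
CIPHERTEXTS = [0, 1]


def P(c, m):
    """P(c, m) = prob(c | m), obtained by enumerating the coin tosses of E."""
    return sum(p for key, p in PROB_KEY.items() if m ^ key == c)


def A(m0, m1, c):
    """The distinguisher from the proof: output m0 iff c is likelier under m0."""
    return m0 if P(c, m0) > P(c, m1) else m1


def prob_outputs_m0(m0, m1, encrypted):
    """prob(A(m0, m1, c) = m0 : c <- E(encrypted))."""
    return sum(P(c, encrypted) for c in CIPHERTEXTS if A(m0, m1, c) == m0)


# Difference of the two probabilities compared in Proposition 9.9.
gap = prob_outputs_m0(0, 1, 0) - prob_outputs_m0(0, 1, 1)
```

Here the set $C_0$ consists of the single ciphertext 0, and `gap` is the sum
of $\mathrm{prob}(c \,|\, m_0) - \mathrm{prob}(c \,|\, m_1)$ over $C_0$, which
is strictly positive because the key is biased.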


Proposition 9.10. $E$ is perfectly secret if and only if for every probabilistic
distinguishing algorithm $A$ and all $m_0, m_1 \in M$, with $m_0 \neq m_1$,

\[
\mathrm{prob}(A(m_0, m_1, c) = m : m \stackrel{u}{\leftarrow} \{m_0, m_1\}, c \leftarrow E(m)) = \frac{1}{2}.
\]
Proof.

\[
\begin{aligned}
&\mathrm{prob}(A(m_0, m_1, c) = m : m \stackrel{u}{\leftarrow} \{m_0, m_1\}, c \leftarrow E(m)) \\
&\quad = \frac{1}{2} \cdot \mathrm{prob}(A(m_0, m_1, c) = m_0 : c \leftarrow E(m_0))
  + \frac{1}{2} \cdot \mathrm{prob}(A(m_0, m_1, c) = m_1 : c \leftarrow E(m_1)) \\
&\quad = \frac{1}{2} + \frac{1}{2} \cdot \bigl( \mathrm{prob}(A(m_0, m_1, c) = m_0 : c \leftarrow E(m_0))
  - \mathrm{prob}(A(m_0, m_1, c) = m_0 : c \leftarrow E(m_1)) \bigr),
\end{aligned}
\]

and the proposition follows from Proposition 9.9.

Remark. Proposition 9.10 characterizes a perfectly secret encryption scheme
in terms of a passive eavesdropper $A$, who performs a ciphertext-only attack.
But, as we observed before, the statement would remain true if we modeled
an (adaptively-)chosen-plaintext/ciphertext attacker by algorithm $A$ (see the
remark after Proposition 9.4).


9.3 Public-Key One-Time Pads 


Vernam’s one-time pad is provably secure (Section 9.1) and thus appears to 
be a very attractive encryption method. However, there is the problem that 
truly random keys of the same length as the message have to be generated 
and securely transmitted to the recipient. The idea now is to use high qual- 
ity (“cryptographically secure”) pseudorandom bit sequences as keys and to 
obtain in this way practical, but still provably secure randomized encryption 
methods. 


Definition 9.11. Let $I = (I_k)_{k \in \mathbb{N}}$ be a key set with security
parameter $k$, and let $G = (G_i)_{i \in I}$, $G_i : X_i \to \{0,1\}^{l(k)}$
($i \in I_k$), be a pseudorandom bit generator with polynomial stretch function
$l$ and key generator $K$ (see Definition 8.2).

The probabilistic polynomial encryption algorithm $E(i, m)$ which, given
(a public key) $i \in I_k$, encrypts a message $m \in \{0,1\}^{l(k)}$ by bitwise
XORing it with the pseudorandom sequence $G_i(x)$, generated by $G_i$ from a
randomly and uniformly chosen seed $x \in X_i$,

\[
E(i, m) := m \oplus G_i(x), \quad x \stackrel{u}{\leftarrow} X_i,
\]

is called the pseudorandom one-time pad induced by $G$. Keys $i$ are assumed
to be generated by $K$.


Example. Let $f = (f_i : D_i \to D_i)_{i \in I}$ be a family of one-way
permutations with hard-core predicate $B = (B_i : D_i \to \{0,1\})_{i \in I}$,
and let $Q$ be a polynomial. $f$, $B$ and $Q$ induce a pseudorandom bit
generator $G(f, B, Q)$ with stretch function $Q$ (see Definition 8.3), and
hence a pseudorandom one-time pad.


We will see in Section 9.4 that computationally perfect pseudorandom 
generators (Definition 8.2), such as the G(f, B,Q)s, lead to provably secure 
encryption schemes. Nevertheless, one important problem remains unsolved: 
how to transmit the secret one-time key — the randomly chosen seed x — to 
the recipient of the message? 

If $G$ is induced by a family of trapdoor permutations with hard-core
predicate, there is an easy answer. We send $x$ hidden by the one-way function
together with the encrypted message. Knowing the trapdoor, the recipient is
able to determine $x$.

Definition 9.12. Let $I = (I_k)_{k \in \mathbb{N}}$ be a key set with security
parameter $k$, and let $Q$ be a positive polynomial. Let
$f = (f_i : D_i \to D_i)_{i \in I}$ be a family of trapdoor permutations with
hard-core predicate $B$ and key generator $K$. Let $G(f, B, Q)$ be the induced
pseudorandom bit generator with stretch function $Q$. For every recipient of
messages, a public key $i \in I_k$ (and the associated trapdoor information)
is generated by $K$.

The probabilistic polynomial encryption algorithm $E(i, m)$ which encrypts
a message $m \in \{0,1\}^{Q(k)}$ as

\[
E(i, m) := \bigl( m \oplus G(f, B, Q)_i(x), \; f_i^{Q(k)}(x) \bigr),
\]

with $x$ chosen randomly and uniformly from $D_i$ (for each message $m$), is
called the public-key one-time pad induced by $f$, $B$ and $Q$.


Remarks: 


1. Recall that we get $G(f, B, Q)_i(x)$ by repeatedly applying $f_i$ to $x$ and
taking the hard-core bits $B_i(f_i^j(x))$ of the sequence

\[
x, \; f_i(x), \; f_i^2(x), \; f_i^3(x), \; \ldots, \; f_i^{Q(k)-1}(x)
\]

(see Definition 8.3). In order to encrypt the seed $x$, we then apply $f_i$
once more. Note that we cannot take $f_i^j(x)$ with $j < Q(k)$ as the
encryption of $x$, because this would reveal bits from the sequence
$G(f, B, Q)_i(x)$.

2. Since $f_i$ is a permutation of $D_i$ and the recipient Bob knows the
trapdoor, he has an efficient algorithm for $f_i^{-1}$. He is able to compute
the sequence

\[
x, \; f_i(x), \; f_i^2(x), \; f_i^3(x), \; \ldots, \; f_i^{Q(k)-1}(x)
\]

from $f_i^{Q(k)}(x)$, by repeatedly applying $f_i^{-1}$. In this way, he can
easily decrypt the ciphertext.

3. In the public-key one-time pad, the basic pseudorandom one-time pad is
augmented by an asymmetric (i.e., public-key) way of transmitting the
one-time symmetric encryption key $x$.

4. Like the basic pseudorandom one-time pad, the augmented version is
provably secure against passive attacks (see Theorem 9.16). Supplying the
encrypted key $f_i^{Q(k)}(x)$ does not diminish the secrecy of the encryption
scheme.

5. Pseudorandom one-time pads and public-key one-time pads are
straightforward analogies to the classical probabilistic encryption method,
Vernam’s one-time pad. We will see in Section 9.4 that more recent notions of
secrecy, such as indistinguishability (introduced in [GolMic84]), are also
analogous to the classical notion in Shannon’s work. The statements on
the secrecy of pseudorandom and public-key one-time pads (see Section
9.4) are analogous to the classical results by Shannon.

6. The notion of probabilistic public-key encryption, whose security may
be rigorously proven in a complexity theoretic model, was suggested
by Goldwasser and Micali ([GolMic84]). They introduced the hard-core
predicates of trapdoor functions (or, more generally, trapdoor predicates)
as the basic building blocks of such schemes. The implementation of
probabilistic public-key encryption, given in [GolMic84] and known as
Goldwasser-Micali probabilistic encryption (see Exercise 7), is based on
the quadratic residuosity assumption (Definition 6.11). During encryption,
messages are expanded by a factor proportional to the security
parameter $k$. Thus, this implementation is quite wasteful in space and
bandwidth and is therefore not really practical. The public-key one-time
pads, introduced in [BluGol85] and [BluBluShu86], avoid this large message
expansion. They are the efficient implementations of (asymmetric)
probabilistic encryption.
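
Remarks 1 and 2 can be made concrete with a small sketch of a public-key
one-time pad. Everything specific here is an assumption for illustration:
the trapdoor permutation is squaring on the quadratic residues modulo a toy
Blum integer $n = 7 \cdot 23$, the hard-core predicate is the
least-significant bit, the message length $l$ plays the role of $Q(k)$, and
the function names are hypothetical. The parameters are far too small to
offer any security.

```python
import secrets
from math import gcd

# Toy parameters, far too small for real use: p and q are primes with
# p = q = 3 (mod 4), so squaring permutes the quadratic residues mod N.
P_PRIME, Q_PRIME = 7, 23
N = P_PRIME * Q_PRIME


def f(x):
    """The one-way permutation: squaring on the quadratic residues mod N."""
    return x * x % N


def hard_core_bit(x):
    """Least-significant bit, used as the hard-core predicate B."""
    return x & 1


def f_inverse(y):
    """Trapdoor inverse: the unique square root of y that is itself a residue."""
    rp = pow(y, (P_PRIME + 1) // 4, P_PRIME)  # root mod p, again a residue
    rq = pow(y, (Q_PRIME + 1) // 4, Q_PRIME)  # root mod q, again a residue
    # Chinese Remainder Theorem: combine the two roots into one modulo N.
    return (rp + P_PRIME * ((rq - rp) * pow(P_PRIME, -1, Q_PRIME) % Q_PRIME)) % N


def random_seed():
    """Square a random unit mod N to get a uniform quadratic residue."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if gcd(r, N) == 1:
            return f(r)


def encrypt(bits, x=None):
    """Return (m XOR G(x), f^l(x)) for a list of l message bits."""
    x = random_seed() if x is None else x
    cipher_bits = []
    for bit in bits:
        cipher_bits.append(bit ^ hard_core_bit(x))
        x = f(x)
    return cipher_bits, x  # x now equals f^l(seed)


def decrypt(cipher_bits, y):
    """Walk back from y = f^l(x) with the trapdoor, last bit first."""
    plain = []
    for bit in reversed(cipher_bits):
        y = f_inverse(y)
        plain.append(bit ^ hard_core_bit(y))
    return plain[::-1]
```

Note that `encrypt` returns $f^l(x)$ and not an earlier iterate, in line with
Remark 1: every value $x, f(x), \ldots, f^{l-1}(x)$ contributes a key bit and
must stay hidden.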


9.4 Passive Eavesdroppers 


We study the security of public-key encryption schemes against passive eaves- 
droppers, who perform ciphertext-only attacks. In a public-key encryption 
scheme, a ciphertext-only attacker (as everybody) can encrypt messages of 
his choice at any time by using the publicly known key. Therefore, secu- 
rity against ciphertext-only attacks in a public-key encryption scheme also 
includes security against adaptively-chosen-plaintext attacks. 

The stronger notion of security against adaptively-chosen-ciphertext at- 
tacks is considered in the subsequent Section 9.5. 

Throughout this section we consider a probabilistic polynomial encryption
algorithm $E(i, m)$, such as the pseudorandom or public-key one-time pads
defined in Section 9.3. Here, $I = (I_k)_{k \in \mathbb{N}}$ is a key set with
security parameter $k$ and, for every $i \in I$, $E$ maps plaintexts
$m \in M_i$ to ciphertexts $c := E(i, m) \in C_i$. The keys $i$ are generated
by a probabilistic polynomial algorithm $K$ and are assumed to be public. In
our examples, the encryption $E$ is derived from a family
$f = (f_i)_{i \in I}$ of one-way permutations, and the index $i$ is the public
key of the recipient.

We define distinguishing algorithms $A$ for $E$ completely analogous to
Definition 9.8. Now the computational resources of the adversary, modeled by
$A$, are limited. $A$ is required to be polynomial.


Definition 9.13. A probabilistic polynomial distinguishing algorithm for $E$
is a probabilistic polynomial algorithm $A(i, m_0, m_1, c)$ which on inputs
$i \in I$, $m_0, m_1 \in M_i$ and $c \in C_i$ outputs an $m \in \{m_0, m_1\}$.


Below we show that pseudorandom one-time pads induced by computationally
perfect pseudorandom bit generators have computationally perfect
secrecy. This result is analogous to the classical result by Shannon that
Vernam’s one-time pad is perfectly secret. Using truly random bit sequences as
the key in the one-time pad, the probability of success of an attack with
unlimited resources, which tries to distinguish between two candidate
plaintexts, is equal to 1/2; so there is no use in observing the ciphertext.
Using computationally perfect pseudorandom bit sequences, the probability of
success of an attack with polynomial resources is, at most, negligibly more
than 1/2.


Definition 9.14. The encryption $E$ is called ciphertext-indistinguishable or
(for short) indistinguishable, if for every probabilistic polynomial
distinguishing algorithm $A(i, m_0, m_1, c)$ and every probabilistic polynomial
sampling algorithm $S$, which on input $i \in I$ yields
$S(i) = \{m_0, m_1\} \subset M_i$, and every positive polynomial
$P \in \mathbb{Z}[X]$, there is a $k_0 \in \mathbb{N}$ such that for all
$k \ge k_0$:

\[
\mathrm{prob}(A(i, m_0, m_1, c) = m : i \leftarrow K(1^k), \{m_0, m_1\} \leftarrow S(i),
  m \stackrel{u}{\leftarrow} \{m_0, m_1\}, c \leftarrow E(i, m)) \le \frac{1}{2} + \frac{1}{P(k)}.
\]


Remarks: 


1. The definition is a definition in the “public-key model”: the keys i are 
public and hence available to the distinguishing algorithms. It can be 
adapted to a private-key setting. See the analogous remark after the 
definition of pseudorandom generators (Definition 8.2). 

2. Algorithm $A$ models a passive adversary, who performs a ciphertext-only
attack. But, everybody knows the public key $i$ and can encrypt messages
of his choice at any time. This implies that the adversary algorithm
$A$ may include the encryption of messages of its choice. We see that
ciphertext-indistinguishability, as defined here, means security against
adaptively-chosen-plaintext attacks. Chosen-ciphertext attacks are
considered in Section 9.5.

3. The output of the sampling algorithm $S$ is a subset $\{m_0, m_1\}$ with two
members, and therefore $m_0 \neq m_1$.

If the message spaces $M_i$ are unrestricted, like $\{0,1\}^*$, then it is
usually required that the two candidate plaintexts $m_0, m_1$, generated by
$S$, have the same bit length. This additional requirement is reasonable.
Typically, the length of the plaintext and the length of the ciphertext are
closely related. Hence, information about the length of the plaintexts
necessarily leaks to an adversary, and plaintexts of different length can
easily be distinguished. For the same reason, we considered Vernam’s one-time
pad as a cipher for plaintexts of a fixed length in Section 9.1 (see the
remark after Theorem 9.5).

In this section, we consider only schemes where all plaintexts are of the 
same bit length. 

4. In view of Proposition 9.10, the definition is the computational analogy
to the notion of perfect secrecy, as defined by Shannon.

Non-perfect secrecy means that some algorithm $A$ is able to distinguish
between distinct plaintexts $m_0$ and $m_1$ (given the ciphertext) with a
probability > 1/2. The running time of $A$ may be exponential. If an
encryption scheme is not ciphertext-indistinguishable, then an algorithm with
polynomial running time is able to distinguish between distinct plaintexts
$m_0$ and $m_1$ (given the ciphertext) with a probability significantly larger
than 1/2. In addition, the plaintexts $m_0$ and $m_1$ can be found in
polynomial time by some probabilistic algorithm $S$. This additional
requirement is adequate. A secrecy problem can only exist for messages which
can be generated in practice by using a probabilistic polynomial algorithm.
The message generation is modeled uniformly by $S$ for all keys $i$ (see the
footnote on non-uniform models of computation below).

5. The notion of ciphertext-indistinguishability was introduced by
Goldwasser and Micali ([GolMic84]). They call it polynomial security or
polynomial-time indistinguishability. Ciphertext-indistinguishable encryption
schemes are also called schemes with indistinguishable encryptions.

6. Another notion of security was introduced in [GolMic84]. An encryption
scheme is called semantically secure, if it has the following property:
Whatever a passive adversary Eve is able to compute about the
plaintext in polynomial time given the ciphertext, she is also able to
compute in polynomial time without the ciphertext. The messages to
be encrypted are assumed to be generated by a probabilistic polynomial
algorithm. Semantic security is equivalent to indistinguishability
([GolMic84]; [MicRacSlo88]; [WatShiIma03]; [Goldreich04]).

7. Recall that the execution of a probabilistic algorithm is an independent
random experiment (see Chapter 5). Thus, the coin tosses of the
distinguishing algorithm $A$ are independent of the coin tosses of the
sampling algorithm $S$ and the coin tosses of the encryption algorithm $E$.
This reflects the fact that sender and adversary operate independently.

8. The probability in our present definition is also taken over the random
generation of a key $i$, with a given security parameter $k$. Even for very
large $k$, there may be insecure keys $i$ such that $A$ is able to distinguish
successfully between two plaintext candidates. However, when randomly
generating keys by the key generator, the probability of obtaining an
insecure one is negligibly small (see Proposition 6.17 for a precise
statement).

9. Ciphertext-indistinguishable encryption algorithms are necessarily
randomized encryptions. When encrypting a plaintext $m$ twice, the
probability that we get the same ciphertext must be negligibly small.
Otherwise it would be easy to distinguish between two messages $m_0$ and $m_1$
by comparing the ciphertext with encryptions of $m_0$ and $m_1$.
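
The observation that a ciphertext-indistinguishable encryption must be
randomized can be illustrated by a short sketch. The toy textbook RSA
parameters in `PUBLIC_KEY` and the function names are assumptions made only
for this example. Because the encryption is deterministic, the distinguishing
algorithm simply re-encrypts one candidate under the public key.

```python
# Toy textbook RSA (deterministic, no padding); the numbers are
# illustrative only and offer no security.
PUBLIC_KEY = (3233, 17)  # n = 61 * 53, e = 17


def rsa_encrypt(pk, m):
    """Deterministic encryption: the same m always yields the same c."""
    n, e = pk
    return pow(m, e, n)


def distinguish_deterministic(encrypt, pk, m0, m1, c):
    """Re-encrypt m0 under the public key and compare with the challenge c."""
    return m0 if encrypt(pk, m0) == c else m1
```

For any two distinct plaintexts below $n$, this algorithm names the encrypted
candidate with probability 1, so the bound of Definition 9.14 fails.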


Footnote: By applying a (more general) non-uniform model of computation
(non-uniform polynomial-time algorithms instead of probabilistic polynomial
algorithms, see, for example, [Goldreich99], [Goldreich04]), one can dispense
with the sampling algorithm $S$.

Theorem 9.15. Let $E$ be the pseudorandom one-time pad induced by a
computationally perfect pseudorandom bit generator $G$. Then $E$ is
ciphertext-indistinguishable.


Proof. The proof runs in exactly the same way as the proof of Theorem 9.16, 
yielding a contradiction to the assumption that G is computationally perfect. 
See below for the details. 


General pseudorandom one-time pads leave open how the secret encryp- 
tion key — the randomly chosen seed — is securely transmitted to the receiver 
of the message. Public-key one-time pads provide an answer. The key is en- 
crypted by the underlying one-way permutation and becomes part of the en- 
crypted message (see Definition 9.12). The indistinguishability is preserved. 


Theorem 9.16. Let $E$ be the public-key one-time pad induced by a family
$f = (f_i : D_i \to D_i)_{i \in I}$ of trapdoor permutations with hard-core
predicate $B$. Then $E$ is ciphertext-indistinguishable.


Remark. Theorem 9.16 states that public-key cryptography provides variants
of the one-time pad which are provably secure and practical. XORing
the plaintext with a pseudorandom bit sequence generated from a short random
seed by a trapdoor permutation with hard-core predicate (e.g., the
Blum-Blum-Shub generator or the RSA generator, see Section 8.1) yields
an encryption with indistinguishability. Given the ciphertext, an adversary
is provably not able to distinguish between two plaintexts. In addition, it is
possible in this public-key one-time pad to securely transmit the key string
(more precisely, the seed of the pseudorandom sequence) to the recipient,
simply by encrypting it by means of the one-way function.

Of course, the security proof for a public-key one-time pad, such as the 
RSA- or Blum-Blum-Shub-based one-time pad, is conditional. It depends on 
the validity of basic unproven (though widely believed) assumptions, such as 
the RSA assumption (Definition 6.7), or the factoring assumption (Definition 
6.9). 

Computing the pseudorandom bit sequences using a one-way permutation 
requires complex computations, such as exponentiation and modular reduc- 
tions. Thus, the classical private-key symmetric encryption methods, like the 
DES (see Chapter 2) or stream ciphers, using shift registers to generate pseu- 
dorandom sequences (see, e.g., [MenOorVan96], Chapter 6), are much more 
efficient than public-key one-time pads, and hence are better suited for large 
amounts of data. 

However, notice that the one-way function of the Blum-Blum-Shub generator
(see Chapter 8) is a quite simple one. Quadratic residues $x \bmod n$
are squared: $x \mapsto x^2 \bmod n$. A public-key one-time pad, whose
efficiency is comparable to standard RSA encryption, can be implemented based
on this generator.

Namely, suppose $n = pq$, with distinct primes $p, q \equiv 3 \bmod 4$ of
binary length $k$. Messages $m$ of length $l$ are to be encrypted. In order to
encrypt $m$, we randomly choose an element from $\mathbb{Z}_n^*$ and square it
to get a random $x$ in $QR_n$. This requires $O(k^2)$ steps. To get the
pseudorandom sequence and to encrypt the random seed $x$ we have to compute
$l$ squares modulo $n$, which comes out to $O(k^2 l)$ steps. XORing requires
$O(l)$ steps. Thus, encryption is finished in $O(k^2 l)$ steps.

To decrypt, we first compute the seed $x$ from $x^{2^l}$ by drawing the square
root $l$ times in $QR_n$. We can do this by drawing the square roots modulo
$p$ and modulo $q$, and applying the Chinese Remainder Theorem (see
Proposition A.62).

Assume, in addition, that we have even chosen $p, q \equiv 7 \bmod 8$. Then
$(p+1)/8$ is in $\mathbb{N}$, and for every quadratic residue $a \in QR_p$ we
get a square root $b$ which again is an element of $QR_p$ by setting

\[
b = a^{(p+1)/4} = \bigl( a^{(p+1)/8} \bigr)^2.
\]

Here, note that

\[
b^2 = a^{(p+1)/2} = a \cdot a^{(p-1)/2} = a,
\]

since $a^{(p-1)/2} = \left( \frac{a}{p} \right) = 1$ for quadratic residues
$a \in QR_p$ (Proposition A.52).

Thus, we can derive $x \bmod p$ from $y = x^{2^l}$ by

\[
x \bmod p = y^u \bmod p, \quad \text{with } u = \left( \frac{p+1}{4} \right)^l \bmod (p - 1).
\]

The exponent $u$ can be reduced modulo $p - 1$, since $a^{p-1} = 1$ for all
$a \in \mathbb{Z}_p^*$. We assume that the message length $l$ is fixed. Then
the exponent $u$ can be computed in advance, and we see that figuring out
$x \bmod p$ (or $x \bmod q$) requires at most $k$ squarings applying the
repeated squaring algorithm (Algorithm A.26). Thus, it can be done in $O(k^3)$
steps.

Reducing $x^{2^l} \bmod p$ (and $x^{2^l} \bmod q$) at the beginning and
applying the Chinese Remainder Theorem at the end requires at most $O(k^2)$
steps. Summarizing, we see that computing $x$ requires $O(k^3)$ steps. Now,
completing the decryption essentially means performing an encryption whose
cost is $O(k^2 l)$, as we saw above. Hence, the complete decryption procedure
takes $O(k^3 + k^2 l)$ steps. If $l = O(k)$, this is equal to $O(k^3)$ and thus
of the same order as the running time of an RSA encryption.
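
The seed recovery with a single precomputed exponent per prime can be sketched
as follows. The toy primes 7 and 23 (both congruent to 7 modulo 8), the fixed
length `L` and the function names are assumptions for illustration only; in
particular, `seed_exponent` is a hypothetical helper that computes the
exponent $u$ from the formula above.

```python
P_PRIME, Q_PRIME = 7, 23   # toy primes, both = 7 (mod 8)
N = P_PRIME * Q_PRIME
L = 10                     # fixed message length l


def seed_exponent(prime, l):
    """u = ((prime + 1) / 4)^l mod (prime - 1); precomputable for fixed l."""
    return pow((prime + 1) // 4, l, prime - 1)


def recover_seed(y, l=L):
    """Recover x in QR_n from y = x^(2^l) mod n, one exponentiation per prime."""
    xp = pow(y % P_PRIME, seed_exponent(P_PRIME, l), P_PRIME)
    xq = pow(y % Q_PRIME, seed_exponent(Q_PRIME, l), Q_PRIME)
    # Chinese Remainder Theorem: combine x mod p and x mod q into x mod n.
    return (xp + P_PRIME * ((xq - xp) * pow(P_PRIME, -1, Q_PRIME) % Q_PRIME)) % N
```

Compared with drawing $l$ successive square roots, this performs one modular
exponentiation modulo $p$ and one modulo $q$, regardless of $l$.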

The efficiency of the Blum-Blum-Shub-based public-key one-time pad (as
well as that of the RSA-based one) can be increased even further, by modifying
the generation of the pseudorandom bit sequence. Instead of taking only
the least-significant bit of $x^{2^j} \bmod n$, you may take the
$\lfloor \log_2(|n|) \rfloor$ least-significant bits after each squaring. These
bits form a $\lfloor \log_2(|n|) \rfloor$-bit hard-core predicate of the
modular squaring function and are simultaneously secure (see Exercise 7 in
Chapter 8). The resulting public-key one-time pad is called Blum-Goldwasser
probabilistic encryption ([BluGol85]). It is also ciphertext-indistinguishable
(see Exercise 8 in Chapter 8).




Our considerations are not only valid asymptotically, as the $O$ notation
might suggest. Take for example $k = 512$ and $|n| = 1024$, and encrypt
1024-bit messages $m$. In the $x^2 \bmod n$ public-key one-time pad, always use
the $\log_2(|n|) = 10$ least-significant bits. To encrypt a message $m$, about
100 modular squarings of 1024-bit numbers are necessary. To decrypt a
ciphertext, we first determine the seed $x$ by at most 1024 = 512 + 512 modular
squarings and multiplications of 512-bit numbers, and then compute the
plaintext using about 100 modular squarings, as in encryption. Encrypting and
decrypting messages $m \in \mathbb{Z}_n$ by RSA requires up to 1024 modular
squarings and multiplications of 1024-bit (encryption) or 512-bit (decryption)
numbers, with the actual number depending on the size of the encryption and
decryption exponents. In the estimates, we did not count the (few) operations
associated with applying the Chinese Remainder Theorem during decryption.


Proof (of Theorem 9.16). Let $K$ be the key generator of $f$ and
$G := G(f, B, Q)$ be the pseudorandom bit generator (Chapter 8). Recall that

\[
E(i, m) = \bigl( m \oplus G_i(x), \; f_i^{Q(k)}(x) \bigr),
\]

where $i \in I = (I_k)_{k \in \mathbb{N}}$, $m \in \{0,1\}^{Q(k)}$ (for
$i \in I_k$) and the seed $x$ is randomly and uniformly chosen from $D_i$.

The image of the uniform distribution on $D_i$ under $f_i^{Q(k)}$ is again the
uniform distribution on $D_i$, because $f_i$ is bijective.

You can obtain a proof of Theorem 9.15 simply by omitting “$\times D_i$”,
$y$, $f_i^{Q(k)}(x)$ everywhere (and by replacing $Q(k)$ by $l(k)$). Our proof
yields a contradiction to Theorem 8.4. In the proof of Theorem 9.15, you get
the completely analogous contradiction to the assumption that the
pseudorandom bit generator $G$ is computationally perfect.

Now assume that there is a probabilistic polynomial distinguishing algorithm
$A(i, m_0, m_1, c, y)$, with inputs $i \in I$, $m_0, m_1 \in M_i$,
$c \in \{0,1\}^{Q(k)}$ (if $i \in I_k$), $y \in D_i$, a probabilistic
polynomial sampling algorithm $S(i)$ and a positive polynomial $P$, such that

\[
\begin{aligned}
&\mathrm{prob}(A(i, m_0, m_1, c, y) = m : i \leftarrow K(1^k), \{m_0, m_1\} \leftarrow S(i),
  m \stackrel{u}{\leftarrow} \{m_0, m_1\}, (c, y) \leftarrow E(i, m)) \\
&\quad = \mathrm{prob}(A(i, m_0, m_1, m \oplus G_i(x), f_i^{Q(k)}(x)) = m : i \leftarrow K(1^k),
  \{m_0, m_1\} \leftarrow S(i), m \stackrel{u}{\leftarrow} \{m_0, m_1\}, x \stackrel{u}{\leftarrow} D_i) \\
&\quad > \frac{1}{2} + \frac{1}{P(k)}
\end{aligned}
\]

for infinitely many $k$. We define a probabilistic polynomial statistical test
$\bar{A} = \bar{A}(i, z, y)$, with inputs $i \in I$, $z \in \{0,1\}^{Q(k)}$ (if
$i \in I_k$) and $y \in D_i$ and output in $\{0, 1\}$:

1. Apply $S(i)$ and get $\{m_0, m_1\} := S(i)$.
2. Randomly choose $m$ in $\{m_0, m_1\}$: $m \stackrel{u}{\leftarrow} \{m_0, m_1\}$.
3. Let

\[
\bar{A}(i, z, y) :=
\begin{cases}
  1 & \text{if } A(i, m_0, m_1, m \oplus z, y) = m, \\
  0 & \text{otherwise.}
\end{cases}
\]


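The statistical test $\bar{A}$ will be able to distinguish between the
pseudorandom sequences produced by $G$ and truly random, uniformly distributed
sequences, thus yielding the desired contradiction. We have to compute the
probability

\[
\mathrm{prob}(\bar{A}(i, z, y) = 1 : i \leftarrow K(1^k), (z, y) \leftarrow \{0,1\}^{Q(k)} \times D_i)
\]

for both the uniform distribution on $\{0,1\}^{Q(k)} \times D_i$ and the
distribution induced by $(G, f^{Q(k)})$. More precisely, we have to compare the
probabilities

\[
\begin{aligned}
p_{k,G} &:= \mathrm{prob}(\bar{A}(i, G_i(x), f_i^{Q(k)}(x)) = 1 : i \leftarrow K(1^k),
  x \stackrel{u}{\leftarrow} D_i) \quad \text{and} \\
p_{k,\mathrm{uni}} &:= \mathrm{prob}(\bar{A}(i, z, y) = 1 : i \leftarrow K(1^k),
  z \stackrel{u}{\leftarrow} \{0,1\}^{Q(k)}, y \stackrel{u}{\leftarrow} D_i).
\end{aligned}
\]

Our goal is to prove that

\[
p_{k,G} - p_{k,\mathrm{uni}} > \frac{1}{P(k)},
\]

for infinitely many $k$. This contradicts Theorem 8.4 and finishes the proof.

From the definition of $\bar{A}$ we get

\[
\begin{aligned}
&\mathrm{prob}(\bar{A}(i, z, y) = 1 : i \leftarrow K(1^k), (z, y) \leftarrow \{0,1\}^{Q(k)} \times D_i) \\
&\quad = \mathrm{prob}(A(i, m_0, m_1, m \oplus z, y) = m : i \leftarrow K(1^k),
  (z, y) \leftarrow \{0,1\}^{Q(k)} \times D_i, \{m_0, m_1\} \leftarrow S(i),
  m \stackrel{u}{\leftarrow} \{m_0, m_1\}).
\end{aligned}
\]

Since the random choice of $(z, y)$ and the random choice of $m$ in the
probabilistically computed pair $S(i)$ are independent, we may switch them in
the probability and obtain

\[
\begin{aligned}
&\mathrm{prob}(\bar{A}(i, z, y) = 1 : i \leftarrow K(1^k), (z, y) \leftarrow \{0,1\}^{Q(k)} \times D_i) \\
&\quad = \mathrm{prob}(A(i, m_0, m_1, m \oplus z, y) = m : i \leftarrow K(1^k),
  \{m_0, m_1\} \leftarrow S(i), m \stackrel{u}{\leftarrow} \{m_0, m_1\},
  (z, y) \leftarrow \{0,1\}^{Q(k)} \times D_i).
\end{aligned}
\]

Now consider $p_{k,G}$:

\[
p_{k,G} = \mathrm{prob}(A(i, m_0, m_1, m \oplus G_i(x), f_i^{Q(k)}(x)) = m : i \leftarrow K(1^k),
  \{m_0, m_1\} \leftarrow S(i), m \stackrel{u}{\leftarrow} \{m_0, m_1\}, x \stackrel{u}{\leftarrow} D_i).
\]

We assumed that this probability is $> 1/2 + 1/P(k)$, for infinitely many $k$.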


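The following computation shows that $p_{k,\mathrm{uni}} = 1/2$ for all $k$,
thus completing the proof. We have

\[
\begin{aligned}
p_{k,\mathrm{uni}}
  &= \mathrm{prob}(A(i, m_0, m_1, m \oplus z, y) = m : i \leftarrow K(1^k), \{m_0, m_1\} \leftarrow S(i),
     m \stackrel{u}{\leftarrow} \{m_0, m_1\}, z \stackrel{u}{\leftarrow} \{0,1\}^{Q(k)},
     y \stackrel{u}{\leftarrow} D_i) \\
  &= \sum_{i} \mathrm{prob}(K(1^k) = i) \cdot \sum_{m_0, m_1} \mathrm{prob}(S(i) = \{m_0, m_1\}) \cdot p_{i, m_0, m_1},
\end{aligned}
\]

with

\[
p_{i, m_0, m_1} := \mathrm{prob}(A(i, m_0, m_1, m \oplus z, y) = m :
  m \stackrel{u}{\leftarrow} \{m_0, m_1\}, z \stackrel{u}{\leftarrow} \{0,1\}^{Q(k)},
  y \stackrel{u}{\leftarrow} D_i).
\]

Vernam’s one-time pad is perfectly secret (Theorem 9.5). Thus, the latter
probability is equal to 1/2 by Proposition 9.10. Note that the additional input
$y \in D_i$ of $A$, not appearing in Proposition 9.10, can be viewed as
additional coin tosses of $A$. So, we finally get the desired equation

\[
p_{k,\mathrm{uni}} = \sum_{i} \mathrm{prob}(K(1^k) = i) \cdot \sum_{m_0, m_1} \mathrm{prob}(S(i) = \{m_0, m_1\}) \cdot \frac{1}{2} = \frac{1}{2},
\]

and the proof of the theorem is finished.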


9.5 Chosen-Ciphertext Attacks 


In the preceding section, we studied encryption schemes which are ciphertext- 
indistinguishable and provide security against a passive eavesdropper, who 
performs a ciphertext-only or an adaptively-chosen-plaintext attack. The 
schemes may still be insecure against an attacker, who manages to get tem- 
porary access to the decryption device and who executes a chosen-ciphertext 
or an adaptively-chosen-ciphertext attack (see Section 1.3 for a classification 
of attacks). Given a ciphertext c, such an attacker Eve tries to get informa- 
tion about the plaintext. In the course of her attack, Eve can get decryptions 
of ciphertexts $c'$ from the decryption device, with the only restriction that
$c' \neq c$. She has temporary access to a “decryption oracle”.

Consider, for example, the efficient implementation of Goldwasser-Micali’s
probabilistic encryption, which we called public-key one-time pad and which
we studied in the preceding sections. A public-key one-time pad encrypts a
message $m$ by XORing it bitwise with a pseudorandom key stream $G(x)$,
where $x$ is a secret random seed. The pseudorandom bit generator $G$ is
induced by a family $f = (f_i : D_i \to D_i)_{i \in I}$ of trapdoor
permutations with hard-core predicate $B$. The public-key-encrypted seed
$f^l(x)$ (where $l$ is the bit length of the messages $m$) is transmitted
together with the encrypted message $m \oplus G(x)$ (see Section 9.4 above).

These encryption schemes are provably secure against passive eavesdroppers
(Theorem 9.16), but they are insecure against a chosen-ciphertext attacker
Eve. Eve submits ciphertexts $(c, y)$ for decryption, where $y$ is any
element in the domain of $f$ and $c = c_1 c_2 \ldots c_l$ is any bit string of
length $l$. If Eve only obtains the last bit $m_l$ of the plaintext
$m = m_1 m_2 \ldots m_l$ from the decryption device, then she immediately
derives the hard-core bit $B(f^{-1}(y))$ of $f^{-1}(y)$, since
$B(f^{-1}(y)) = c_l \oplus m_l$. Therefore, Eve has an oracle that provides her
with the hard-core bit $B(f^{-1}(y))$ for every $y$. Now assume that $f$ is the
RSA function modulo a composite $n = pq$ or the Rabin function, which squares
quadratic residues modulo $n = pq$, as in Blum-Blum-Shub encryption, and $B$ is
the least significant bit. The hard-core-bit oracle enables Eve to compute the
inverse of the RSA or Rabin function modulo $n$ by using an efficient
algorithm, which calls the oracle as a subroutine. We constructed these
algorithms in Sections 7.2 and 7.3. Then, of course, Eve can also compute the
seed $x$ from $f^l(x)$ and derive the plaintext $m = c \oplus G(x)$ for every
ciphertext $c$.
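
The first step of this attack, turning a partial decryption oracle into a
hard-core-bit oracle, can be sketched as follows. The setting is an
assumption for illustration: squaring modulo the toy Blum integer
$n = 7 \cdot 23$ with the least-significant bit as predicate, and the function
names are hypothetical. The inversion algorithms of Sections 7.2 and 7.3,
which would use this oracle as a subroutine, are not reproduced here.

```python
P_PRIME, Q_PRIME = 7, 23          # toy trapdoor (secret), both = 3 (mod 4)
N = P_PRIME * Q_PRIME             # public key


def _sqrt_qr(y):
    """Secret trapdoor inverse of squaring on the quadratic residues mod N."""
    rp = pow(y, (P_PRIME + 1) // 4, P_PRIME)
    rq = pow(y, (Q_PRIME + 1) // 4, Q_PRIME)
    return (rp + P_PRIME * ((rq - rp) * pow(P_PRIME, -1, Q_PRIME) % Q_PRIME)) % N


def decryption_oracle_last_bit(cipher_bits, y):
    """What Eve is allowed to see: only the last plaintext bit m_l."""
    return cipher_bits[-1] ^ (_sqrt_qr(y) & 1)


def hard_core_bit_oracle(y, l=8):
    """Eve's derived oracle: B(f^-1(y)) = c_l XOR m_l for a chosen ciphertext."""
    chosen = [0] * l              # any bit string of length l will do
    return chosen[-1] ^ decryption_oracle_last_bit(chosen, y)
```

Eve never touches `_sqrt_qr` directly; she learns the least-significant bit of
the square root of any `y` she likes from a single plaintext bit returned by
the decryption device.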

We have to worry about adaptively-chosen-ciphertext attacks. One can 
imagine scenarios where Bob, the owner of the secret decryption key, might 
think that decryption requests are reasonable — for example, if an incomplete 
configuration or control of privileges enables attacker Eve from time to time 
to get access to the decryption device. If a system is secure against chosen- 
ciphertext attacks, then it also resists partial chosen-ciphertext attacks. In 
such an attack, adversary Eve does not get the full plaintext in response to 
her decryption requests, but only some partial information. Partial-chosen- 
ciphertext attacks are a real danger in practice. We just discussed a partial- 
chosen-ciphertext attack against public-key one-time pads. In Section 3.3.3, 
we described Bleichenbacher’s 1-Million-Chosen-Ciphertext Attack against 
PKCS#1(v1.5)-based schemes, which is a practical example of a partial- 
chosen-ciphertext attack. 

Therefore, it is desirable to have encryption schemes which are provably 
secure against adaptively-chosen-ciphertext attacks. We give two examples 
of such schemes. The security proof of the first scheme, Boneh’s SAEP, re- 
lies on the random oracle model, which we described in Section 3.4.5, and 
the factoring assumption (Definition 6.9). The security proof of the second 
scheme, Cramer-Shoup’s public key encryption scheme, is based solely on a 
standard number-theoretic assumption of the hardness of a computational 
problem and on a standard hash function assumption (collision-resistance). 
The stronger random oracle model is not needed. 

We start with a definition of the security notion. The definition includes
a precise description of the attack model. As before, we strive for
ciphertext-indistinguishability. This notion can be extended to cover
adaptively-chosen-ciphertext attacks ([NaoYun90]; [RacSim91]).


Definition 9.17. Let E be a public-key encryption scheme. 


1. An adaptively-chosen-ciphertext attack algorithm A against E is a
probabilistic polynomial algorithm that interacts with its environment, called
the challenger, as follows:

a. Setup: The challenger C randomly generates a public-secret key pair
(pk, sk) for E and calls A with the public key pk as input. The secret
key sk is kept secret.

b. Phase I: The adversary A issues a sequence of decryption requests 
for various ciphertexts c’. The challenger responds with the decryp- 
tion of the valid ciphertexts c’. 

c. Challenge: At some point, algorithm A outputs two distinct messages
m_0, m_1. The challenger selects a message m ∈ {m_0, m_1} at
random and responds with the "challenge" ciphertext c, which is an
encryption E(pk, m) of m.

d. Phase II: The adversary A continues to request the decryption of
ciphertexts c', with the only constraint that c' ≠ c. The challenger
decrypts c', if c' is valid, and sends the plaintext to A. Finally, A
terminates and outputs m' ∈ {m_0, m_1}.

The attacker A is successful, if m’ = m. 

We also call A, more precisely, an adaptively-chosen-ciphertext distin- 
guishing algorithm. 

E is ciphertext-indistinguishable against adaptively-chosen-ciphertext
attacks, if for every adaptively-chosen-ciphertext distinguishing algorithm
A, the probability of success is ≤ 1/2 + ε, with ε negligible.

We also call such an encryption scheme E indistinguishability-secure, or
secure, for short, against adaptively-chosen-ciphertext attacks.
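
The game of Definition 9.17 can be sketched as a small test harness. The following is only an illustration under stated assumptions: the scheme plugged in is textbook RSA with tiny toy numbers (not a scheme discussed in this section), and the names `cca2_game`, `phase1` and `phase2` are hypothetical. The adversary shown exploits the multiplicative structure of textbook RSA in phase II, which is exactly the kind of decryption request the definition permits.

```python
import secrets

# Textbook RSA with tiny toy numbers (assumption, for illustration only).
N, E, D = 3233, 17, 2753          # N = 61 * 53 and E * D = 1 mod 3120


def keygen():
    return (N, E), D


def encrypt(pk, m):
    n, e = pk
    return pow(m, e, n)


def decrypt(sk, c):
    return pow(c, sk, N)


def cca2_game(adversary_phase1, adversary_phase2):
    """Play the game once; return True if the adversary guesses correctly."""
    pk, sk = keygen()                                          # Setup
    m0, m1 = adversary_phase1(pk, lambda c: decrypt(sk, c))    # Phase I
    b = secrets.randbelow(2)                                   # Challenge
    challenge = encrypt(pk, (m0, m1)[b])

    def oracle(c):                                             # Phase II
        if c == challenge:
            raise ValueError("decryption of the challenge is not allowed")
        return decrypt(sk, c)

    guess = adversary_phase2(pk, challenge, (m0, m1), oracle)
    return guess == b


def phase1(pk, oracle):
    # This adversary needs no phase-I queries; it just names two messages.
    return 2, 3


def phase2(pk, challenge, messages, oracle):
    n, e = pk
    blinded = challenge * pow(2, e, n) % n    # encrypts 2*m, differs from the challenge
    doubled = oracle(blinded)
    return 0 if doubled == 2 * messages[0] % n else 1
```

Against this toy scheme the adversary wins every run, so the scheme is not secure in the sense of the definition; a secure scheme must keep the success probability at most negligibly above 1/2.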


Remarks:


1. In the real attack, the challenger is Bob, the legitimate owner of a
public-secret key pair (pk, sk). He is attacked by the adversary A.

2. If the message space is unrestricted, like {0, 1}*, then it is usually required
that the two candidate plaintexts m_0, m_1 have the same bit length. This
additional requirement is reasonable. Typically, the length of the plaintext
and the length of the ciphertext are closely related. Hence, information
about the length of the plaintexts necessarily leaks to an adversary,
and plaintexts of different length can easily be distinguished (also see the
remark on Vernam's one-time pad after Theorem 9.5).

If the message space is a number-theoretic or geometric domain, as in
the typical public-key encryption scheme, then, usually, all messages have
the same bit length. Take, for example, Z_n. The messages m ∈ Z_n are all
encoded as bit strings of length ⌊log_2(n)⌋ + 1; they are padded out with
the appropriate number of leading zeros, if necessary.

3. In the literature, adaptively-chosen-ciphertext attacks are sometimes
denoted by the acronym CCA2, and sometimes they are simply called
chosen-ciphertext attacks. Non-adaptive chosen-ciphertext attacks, which
can request the decryption of ciphertexts only in phase I of the attack,
are often called lunchtime attacks or midnight attacks and denoted by
the acronym CCA1.

4. The notion of semantic security (see the remarks after Definition 9.14)
can also be carried over to the adaptively-chosen-ciphertext setting. It
means that whatever an adversary Eve is able to compute about the
plaintext m in polynomial time given the ciphertext c, she is also able to
compute in polynomial time without the ciphertext, even if Eve gets the
decryption of any adaptively chosen ciphertexts c' ≠ c. As shown recently,
semantic security and indistinguishability are also equivalent in the case
of adaptively-chosen-ciphertext attacks ([WatShiIma03]; [Goldreich04]).


9.5.1 A Security Proof in the Random Oracle Model 


Boneh’s Simplified OAEP — SAEP. As an example, we study Boneh’s 
Simple-OAEP encryption, or SAEP for short ([Boneh01]). In the encryption 
scheme, a collision-resistant hash function h is used. The security proof for 
SAEP is a proof in the random oracle model. Basically, this means that h 
is assumed to be a truly random function (see page 66 for a more precise 
description of the random oracle model). 

In contrast to Bellare’s OAEP, which we studied in Section 3.3.4, SAEP 
applies Rabin encryption (Section 3.6.1) and not the RSA function. Com- 
pared to OAEP, the padding scheme is considerably simplified. It requires 
only one cryptographic hash function. The slightly more complex padding
scheme SAEP+ is provably secure and can be applied to the RSA and the
Rabin function (see [Boneh01]). Like OAEP, it requires an additional hash
function G. We do not discuss SAEP+ here.


Key Generation. Let k ∈ N be an even security parameter (e.g., k = 1024).
Bob generates a (k + 2)-bit modulus n = pq, with 2^{k+1} < n < 2^{k+1} + 2^k (i.e.,
the two most significant bits of n are 10), where p and q are (k/2 + 1)-bit
primes, with p ≡ q ≡ 3 mod 4. The primes p, q are chosen randomly. The
public key is n, the private key is (p, q).

The security parameter is split into 3 parts, k = l + s_0 + s_1, with l ≤
k/4 and l + s_0 ≤ k/2. In practice, typical values for these parameters are
k = 1024, l = 256, s_0 = 128, s_1 = 640. The constraints on the lengths of the
security parameters are necessary for the security proof (see Theorem 9.19
below).

We make use of a collision-resistant hash function


h: {0,1}^{s_1} → {0,1}^{l+s_0}.


Notation. As usual, we denote by 0^r (or 1^r) the constant bit string 000...0
(or 111...1) of length r (r ∈ N). As always, let || denote the concatenation
of strings and ⊕ be the bitwise XOR operator.


Encryption. To encrypt an l-bit message m ∈ {0,1}^l for Bob, Alice
proceeds in the following steps:


1. She chooses a random bit string r ∈ {0,1}^{s_1}.

2. She appends s_0 0-bits to m to obtain the (l + s_0)-bit string x := m||0^{s_0}.

3. She sets y = (x ⊕ h(r))||r.

4. She views the k-bit string y as a k-bit integer and applies the Rabin
trapdoor function modulo n to obtain the ciphertext c:


c := y^2 mod n.


Note that y < 2^k < n/2.

Remarks: 


1. At first glance, the length l of the plaintexts might appear small (recall
that a typical value is l = 256). But usually we encrypt only short data
by using a public-key method, for example, session keys for a symmetric
cipher, such as Triple-DES or AES, and for this purpose 256 bits are
really sufficient.

2. All ciphertexts are quadratic residues modulo n. But not every quadratic
residue appears as a ciphertext. Let c be a quadratic residue modulo n.
Then c is the encryption of some plaintext in {0,1}^l, if there is a square
root y of c modulo n, such that

a. y < 2^k, i.e., we may consider y as a bit string of length k, and
b. the s_0 least-significant bits of v ⊕ h(r) are all 0, where y = v||r, v ∈
{0,1}^{l+s_0}, r ∈ {0,1}^{s_1}.

In this case, the plaintext m, whose encryption is c, consists of the l
most-significant bits of v ⊕ h(r), i.e., v ⊕ h(r) = m||0^{s_0}, and we call c a
valid ciphertext (or a valid encryption of m) with respect to y.


The security of SAEP is based on the factoring assumption: it is practi- 
cally infeasible to compute the prime factors p and q of n (see Section 6.5 
for a precise statement of the assumption). Decrypting ciphertexts requires 
drawing square roots. Without knowing the prime factors of n, it is infeasible 
to compute square roots modulo n. The ability to compute modular square 
roots is equivalent to the ability to factorize the modulus n (Proposition A.62, 
Lemma A.63). Bob can compute square roots modulo n because he knows 
the secret factors p and q. 

We recall from Proposition A.62 some basics on computing square roots. 

Let c ∈ Z_n be a quadratic residue, c ≠ 0. If c is prime to n, then c has 4
distinct square roots modulo n. If c is not prime to n, i.e., if c is a multiple of
p or q, then c has only 2 square roots modulo n. In the latter case, the factors
p, q of n can be easily derived by computing gcd(c, n) with the Euclidean
algorithm. The probability for this case is negligibly small, if c is a random
quadratic residue.

If y^2 mod n = c, then also (n − y)^2 mod n = c. Hence, exactly two of the
4 roots (or one of the two roots) are < n/2.

The residue [0] ∈ Z_n has the only square root [0].

If both primes p and q are = 3 mod 4, as here in SAEP, and if the factors 
p and q are known, then the square roots of c can be easily and efficiently 
computed as follows: 
1. (p+1)/4 and (q+1)/4 are integers, and the powers z_p = c^{(p+1)/4} mod p of
c mod p and z_q = c^{(q+1)/4} mod q of c mod q are square roots of c modulo
p and modulo q (see Proposition A.60). Recall that z_p^2 = c^{(p+1)/2} =
c · c^{(p−1)/2} and c^{(p−1)/2} ≡ 1 mod p, if c mod p ≠ 0 (see Euler's criterion,
Proposition A.52). If c mod p ≠ 0, then ±z_p are the two distinct square
roots of c mod p. If c mod p = 0, then there is the single root z_p = 0 of c
modulo p.

2. To get the square roots of c modulo n, we map (±z_p, ±z_q) to Z_n by
applying the inverse Chinese remainder map. If c is prime to n, then we
obtain 4 distinct roots. If z_p = 0 or z_q = 0, we get 2 distinct roots, and
if both z_p and z_q are 0, then there is the only root 0 (see Proposition
A.62).
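
The two steps above can be sketched in a few lines of Python. This is a minimal illustration, not the book's algorithm text: the function name `sqrt_mod_blum` is hypothetical, the primes 7 and 11 in the usage note are toy values, and the three-argument `pow` with exponent -1 assumes Python 3.8 or later.

```python
def sqrt_mod_blum(c, p, q):
    """All square roots of c modulo n = p*q, for primes p = q = 3 mod 4.

    Returns a sorted list; the list is empty if c is not a quadratic residue.
    """
    n = p * q
    # Step 1: candidate roots modulo p and modulo q.
    zp = pow(c % p, (p + 1) // 4, p)
    zq = pow(c % q, (q + 1) // 4, q)
    if (zp * zp - c) % p != 0 or (zq * zq - c) % q != 0:
        return []                      # c is not a square modulo p or modulo q
    # Step 2: inverse Chinese remainder map applied to (+-zp, +-zq).
    a = q * pow(q, -1, p)              # a = 1 mod p, a = 0 mod q
    b = p * pow(p, -1, q)              # b = 0 mod p, b = 1 mod q
    roots = {(sp * a + sq * b) % n for sp in (zp, -zp) for sq in (zq, -zq)}
    return sorted(roots)
```

For example, with p = 7 and q = 11 the residue 4 has the four roots 2, 9, 68, 75, while 49 (a multiple of p) has only the two roots 7 and 70.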


Decryption. Bob decrypts a ciphertext c by using his secret key (p,q) as 
follows: 


1. He computes the square roots of c modulo n, as just described.
In this computation, he tests that z_p^2 ≡ c mod p and z_q^2 ≡ c mod q. If
either test fails, then c is not a quadratic residue and Bob rejects c.

2. Two of the four roots (or one of the two roots) are > n/2 and hence can
be discarded. Bob is left with 2 square roots y_1, y_2 (let y_1 = y_2, if there
is only one left). If neither of y_1, y_2 is < 2^k, then Bob rejects c.
From now on, we assume that y_1 < 2^k. If y_2 ≥ 2^k, then Bob does not
have to distinguish between two candidates, and thus he can simplify the
following steps (omit the processing of y_2). So, assume now that both y_1
and y_2 are < 2^k. Then, we may view them as strings in {0,1}^k.

3. Bob writes y_1 = v_1||r_1 and y_2 = v_2||r_2, with v_1, v_2 ∈ {0,1}^{l+s_0} and
r_1, r_2 ∈ {0,1}^{s_1}, and computes x_1 = v_1 ⊕ h(r_1) and x_2 = v_2 ⊕ h(r_2).

4. He writes x_1 = m_1||t_1 and x_2 = m_2||t_2, with m_1, m_2 ∈ {0,1}^l and t_1, t_2 ∈
{0,1}^{s_0}. If either none or both of t_1, t_2 are 00...0, then Bob rejects c.
Otherwise, let i ∈ {1,2} be the unique index with t_i = 00...0. Then, m_i is
the decryption of c.
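
To make the padding and the decryption steps concrete, here is a toy round trip in Python. Everything specific in it is an assumption made for illustration: the parameters k = 16, l = 4, s_0 = 4, s_1 = 8, the primes 383 and 419, the use of truncated SHA-256 as a stand-in for h, and all function names. With s_0 this small, a valid ciphertext is rejected as ambiguous with probability of roughly 1/16, so `decrypt` may return `None`.

```python
import hashlib
import secrets

# Toy parameters (assumption, far too small for real use): k = l + s0 + s1.
K, L, S0, S1 = 16, 4, 4, 8
P, Q = 383, 419                 # 9-bit primes, both = 3 mod 4
N = P * Q                       # 160477, between 2^17 and 2^17 + 2^16


def h(r):
    """Stand-in for h: {0,1}^s1 -> {0,1}^(l+s0), built from SHA-256."""
    digest = hashlib.sha256(r.to_bytes((S1 + 7) // 8, "big")).digest()
    return int.from_bytes(digest, "big") >> (256 - (L + S0))


def encrypt(m, r=None):
    if r is None:
        r = secrets.randbits(S1)          # step 1: random r
    x = m << S0                           # step 2: x = m || 0^s0
    y = ((x ^ h(r)) << S1) | r            # step 3: y = (x xor h(r)) || r
    return pow(y, 2, N)                   # step 4: Rabin function


def roots_below_half(c):
    """Square roots of c modulo N that are < N/2 (uses the secret P, Q)."""
    zp = pow(c % P, (P + 1) // 4, P)
    zq = pow(c % Q, (Q + 1) // 4, Q)
    if (zp * zp - c) % P or (zq * zq - c) % Q:
        return []                         # not a quadratic residue
    a, b = Q * pow(Q, -1, P), P * pow(P, -1, Q)
    roots = {(sp * a + sq * b) % N for sp in (zp, -zp) for sq in (zq, -zq)}
    return sorted(y for y in roots if 2 * y < N)


def candidates(c):
    """Plaintexts for which c is valid with respect to some root y < 2^k."""
    found = []
    for y in roots_below_half(c):
        if y < 2 ** K:
            v, r = y >> S1, y & (2 ** S1 - 1)
            x = v ^ h(r)
            m, t = x >> S0, x & (2 ** S0 - 1)
            if t == 0:
                found.append(m)
    return found


def decrypt(c):
    found = candidates(c)
    return found[0] if len(found) == 1 else None   # reject if none or both
```

The helper `candidates` is separated from `decrypt` so that the rare double-valid case can be observed directly: the true plaintext is always among the candidates, and rejection happens only when a second root also passes the padding check.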


Remark. It might happen that Bob rejects a valid encryption c of a message
m, because c is valid with respect to both roots y_1 and y_2 and hence both
t_1 and t_2 are 00...0. But the probability for this event is negligibly small, at
least if h comes close to a random function.

Namely, assume y_1 ≠ y_2 and c is a valid encryption of m with respect to
y_1, i.e., y_1 = ((m||0^{s_0}) ⊕ h(r_1))||r_1. If h is assumed to be a random function,
then the probability that c is also a valid ciphertext with respect to y_2 is
about 1/2^{s_0}. There are 2 cases. If r_1 ≠ r_2, then h(r_2) is randomly generated
independently from h(r_1). Hence, the probability that the s_0 least-significant
bits of h(r_2) are just the s_0 least-significant bits of v_2 is 1/2^{s_0}. If r_1 = r_2 and
c is valid with respect to both roots, then the (s_0 + s_1) least-significant bits
of y_1 and y_2 coincide, i.e., y_2 = y_1 + 2^{s_0+s_1} δ, with absolute value |δ| < 2^l.
Since |δ| < 2^l < 2^{k/2} < p, q, we know that δ is a unit modulo n. From

y_2^2 = (y_1 + 2^{s_0+s_1} δ)^2 = y_1^2 + 2^{s_0+s_1+1} δ y_1 + 2^{2(s_0+s_1)} δ^2 = c = y_1^2 mod n,

we conclude that y_1 = −2^{s_0+s_1−1} δ mod n, and the probability for that is
1/2^{s_0+s_1−1}.

The rejection of valid ciphertexts can be completely avoided by a slight
modification of the encryption algorithm. Alice repeats the random generation
of r, until y has Jacobi symbol 1. Then Bob can always select the correct
square root by taking the unique square root < 2^k with Jacobi symbol 1. We
described a similar approach in Section 3.6.1 on Rabin's encryption. However,
this makes the encryption scheme less efficient, and it is not necessary.


The proof of security for SAEP is based on an important result due to 
Coppersmith ([Coppersmith97]). 


Theorem 9.18. Let n be an integer, and let f(X) ∈ Z_n[X] be a monic
polynomial of degree d. Then there is an efficient algorithm which finds all
x ∈ Z such that the absolute value |x| < n^{1/d} and f(x) ≡ 0 mod n.


For a proof, see [Coppersmith97]. The special case f(X) = X^d − c, c ∈ Z_n,
is easy. To get the solutions x with |x| < n^{1/d}, you can compute the ordinary
d-th roots of c, because for 0 ≤ x < n^{1/d} we have x^d mod n = x^d. To compute
the ordinary d-th roots is easy, since x ↦ x^d (without taking residues) is
strictly monotonic (take, for example, the simple Algorithm 3.4).
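
The easy special case can be illustrated with a plain binary search over the integers. This is a sketch only: it covers non-negative roots of X^d − c and does not implement Coppersmith's algorithm for general polynomials; the function name `integer_root` is hypothetical.

```python
def integer_root(c, d):
    """Return the integer x >= 0 with x**d == c, or None if there is none."""
    lo, hi = 0, 1
    while hi ** d <= c:          # find an upper bound by doubling
        hi *= 2
    # Invariant: lo**d <= c < hi**d; x -> x**d is strictly monotonic.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if mid ** d <= c:
            lo = mid
        else:
            hi = mid
    return lo if lo ** d == c else None
```

If 0 ≤ x < n^{1/d}, then `pow(x, d, n)` equals `x ** d`, so `integer_root(pow(x, d, n), d)` returns x without any knowledge of the factors of n.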


Theorem 9.19. Assume that the hash function h in SAEP is a random
oracle. Let n = pq be a key for SAEP with security parameter k = l + s_0 +
s_1 (i.e., 2^{k+1} < n < 2^{k+1} + 2^k). Assume l ≤ k/4 and l + s_0 ≤ k/2. Let
A(n, l, s_0, s_1) be a probabilistic distinguishing algorithm with running time t
that performs an adaptively-chosen-ciphertext attack against SAEP and has
a probability of success ≥ 1/2 + ε. Let q_d be the number of A's decryption
queries, and let q_h be the number of A's queries of the random oracle h.
Then there is a probabilistic algorithm B for factoring the modulus n with


running time t + O(q_d q_h t_C + q_d t'_C) and


probability of success ≥ (1/6) · ε · (1 − 2q_d/2^{s_0} − 2q_d/2^{s_1}).


Here, t_C (resp. t'_C) is the running time of Coppersmith's algorithm for finding
"small-size" roots of polynomials of degree 2 (resp. 4) modulo n (i.e., roots
with absolute value < n^{1/2} resp. < n^{1/4}).


Remarks: 


1. Recall that typical values of the parameters are k = 1024, l = 256, s_0 =
128, s_1 = 640. The number q_d of decryption queries that an adversary
can issue in practice should be limited by 2^40. We see that the fractions
2q_d/2^{s_0} and 2q_d/2^{s_1} are negligibly small.
2. Provided the factoring assumption is true, we conclude from Boneh's
theorem that a probabilistic polynomial attacking algorithm like A, whose
probability of success is ≥ 1/2 + 1/P(k), P a positive polynomial (i.e.,
A has a non-negligible "advantage"), cannot exist. Otherwise, B would
factor moduli n in polynomial time. Thus, SAEP is indistinguishability-
secure against adaptively-chosen-ciphertext attacks (in the random oracle
model).
But Boneh’s result is more precise: It gives a tight reduction from the 
problem of attacking SAEP to the problem of factoring. If we had a 
successful attacking algorithm A against SAEP, we could give precise 
estimates for the running time and the probability of successfully factor- 
ing n. Or, conversely, if we can state reasonable bounds for the running 
time and the probability of success in factoring numbers n of a given 
bit length, we can derive a concrete lower bound for the running time of 
algorithms attacking SAEP with a given probability of success. 
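
As a small worked example of how tight the bound is, the loss factor in Theorem 9.19 can be evaluated exactly for the typical parameters. The query bound q_d = 2^40 used here is an assumed value for illustration, and the variable names are hypothetical.

```python
from fractions import Fraction

s0, s1 = 128, 640            # typical SAEP parameters from the text
q_d = 2 ** 40                # assumed bound on the number of decryption queries

# The two fractions subtracted in the success probability of Theorem 9.19.
loss_s0 = Fraction(2 * q_d, 2 ** s0)     # equals 2^-87
loss_s1 = Fraction(2 * q_d, 2 ** s1)     # equals 2^-599
loss = loss_s0 + loss_s1

# B factors n with probability at least eps/6 * (1 - loss).
factor = Fraction(1, 6) * (1 - loss)
```

With these values the total loss is below 2^-86, so the factoring algorithm succeeds with probability essentially ε/6.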


Proof. We sketch the basic ideas of the proof. In particular, we illustrate
how the random oracle assumption is applied. For more details we refer to
[Boneh01].

It suffices to construct an algorithm S which on inputs n, l, s_0, s_1 and
c = a^2 mod n, for a randomly chosen a with 0 ≤ a < 2^k, outputs a square
root a', 0 ≤ a' < 2^k, of c with probability ≥ ε · (1 − 2q_d/2^{s_0} − 2q_d/2^{s_1}).
Namely, with probability > 1/3, a number c with 0 ≤ c < n has two distinct
square roots modulo n in [0, 2^k[ (see footnote 3). Hence a ≠ a' with
probability ≥ 1/6 and then n can be factored by computing gcd(n, a − a')
(see Lemma A.63 and Section 3.6.1).
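
The last step, factoring n from two essentially different square roots, is a single gcd computation. A minimal sketch with the toy modulus 77 (an assumed example value; the function name is hypothetical):

```python
from math import gcd


def factor_from_roots(n, a, a_prime):
    """Try to split n, given a^2 = a'^2 mod n; None if the roots are a' = +-a."""
    g = gcd(n, abs(a - a_prime))
    return (g, n // g) if 1 < g < n else None
```

For n = 77, the roots 2 and 9 of the residue 4 give gcd(77, 7) = 7, whereas the pair 2 and 75 = −2 mod 77 reveals nothing. In the proof both roots lie in [0, 2^k[ and are therefore < n/2, so a' = −a cannot occur.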

In the following we describe the algorithm S. The algorithm efficiently
computes a square root < 2^k of c, without knowing p, q, in two cases, which
we study first.


1. Let y = v||r, where v and r are bit strings of lengths (l + s_0) and s_1. If y
is a root of c, then v is a root of the quadratic polynomial (2^{s_1} X + r)^2 − c
modulo n, with 0 ≤ v < 2^{l+s_0} ≤ 2^{k/2} < n^{1/2}.

Thus, if S happens to know or guess correctly the s_1 lower significant
bits r of a root y, then it efficiently finds y by applying the following
algorithm

CompRoot1(c, r):

Compute the roots v < 2^{k/2} of (2^{s_1} X + r)^2 − c modulo n by using
Coppersmith's algorithm. If such a v is found, return the root y = v||r.

2. Let y = m||w be a square root modulo n of c, with m ∈ {0,1}^l and
w ∈ {0,1}^{s_0+s_1}. Let c' ∈ Z_n be a further quadratic residue, c' ≠ c,
and assume that c' has a k-bit square root y' = m'||w modulo n, whose
(s_0 + s_1) lower significant bits w are the same as those of y. Then we
may write y' = y + 2^{s_0+s_1} δ, where δ = m' − m is an integer with absolute
value |δ| < 2^l ≤ 2^{k/4} < n^{1/4}. y is a common root of the polynomials
f(X) = X^2 − c and g(X, δ) = (X + 2^{s_0+s_1} δ)^2 − c' modulo n. Therefore, δ
is a root of the resultant R of f(X) and g(X, Δ) = (X + 2^{s_0+s_1} Δ)^2 − c'.
The resultant R is a polynomial modulo n in Δ of degree 4 (see, for
example, [Lang05], Chapter IV). Since |δ| < 2^l ≤ 2^{k/4} < n^{1/4}, we can
compute δ efficiently by using Coppersmith's algorithm for polynomials
of degree 4. The greatest common divisor gcd(f(X), g(X, δ)) is X − y.
Thus, if S happens to get such an element c', then it efficiently finds a
square root y of c by applying the following algorithm

CompRoot2(c, c'):

Compute the roots δ with absolute value |δ| < 2^l of the resultant R modulo
n by using Coppersmith's algorithm. Compute the greatest common
divisor X − y of X^2 − c and (X + 2^{s_0+s_1} δ)^2 − c' modulo n. If such a δ
(and then y) is found, return the root y.


3 See Fact 2 in [Boneh01]; n is chosen between 2^{k+1} and 2^{k+1} + 2^k to get this
estimate.
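
The gcd step of CompRoot2 is elementary once δ is known: the difference g(X, δ) − f(X) is linear in X and vanishes at y. The sketch below shows only this final step and assumes that δ has already been found; the Coppersmith step itself is not implemented. The toy modulus, the bit split and the function name are assumptions for illustration.

```python
N = 383 * 419        # toy modulus (assumption), as for k = 16
S = 12               # s_0 + s_1 for the toy split l = 4, s_0 = 4, s_1 = 8


def root_from_delta(c, c_prime, delta, n=N, s=S):
    """Recover y from c = y^2 and c' = (y + 2^s * delta)^2 mod n, delta != 0."""
    shift = (delta << s) % n
    # g(X) - f(X) = 2*shift*X + shift^2 - c' + c is linear in X; its root is y.
    return (c_prime - c - shift * shift) * pow(2 * shift, -1, n) % n
```

Usage: take y = (3 << 12) | 0x5A7 and y' = (5 << 12) | 0x5A7, which share their 12 low bits and differ by δ = 2 in the top part; `root_from_delta(y*y % N, y_prime*y_prime % N, 2)` returns y.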


Algorithm S interacts with the attacking algorithm A. In the real
attack, A interacts with Bob, the legitimate owner of the public-secret key pair
(n, (p, q)), to obtain decryptions of ciphertexts of its choice, and with the
random oracle h to get hash values h(r) (in practice, interacting typically means
to communicate with another computer program). Now S is constructed to
replace both Bob and the random oracle h in the attack. It "simulates" Bob
and h.

Each time, when A issues a query, S has a chance to compute a square
root of c, and of course S terminates when a root is found.

S has no problems answering the hash queries. Since h is a random oracle,
S can assign a randomly generated v ∈ {0,1}^{l+s_0} as hash value h(r). The
only restriction is: If the hash value of the same r is queried more than once,
then always the same value has to be provided. Therefore, S has to store the list
H of hash value pairs (r, h(r)) that it has given to A.
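
The bookkeeping for the simulated oracle is the usual lazy sampling with a table. A minimal sketch follows; the class name `SimulatedOracle` is hypothetical, and the attempt to compute a root before answering a fresh query (part of S) is deliberately left out.

```python
import secrets


class SimulatedOracle:
    """Answers hash queries with random values, consistently on repeats."""

    def __init__(self, out_bits):
        self.out_bits = out_bits      # l + s_0 in the SAEP setting
        self.table = {}               # the list H of pairs (r, h(r))

    def query(self, r):
        if r not in self.table:
            # Fresh input: pick a uniformly random (l + s_0)-bit value.
            self.table[r] = secrets.randbits(self.out_bits)
        return self.table[r]
```

From the point of view of the caller, such a table-backed oracle is distributed exactly like a random function evaluated at the queried points.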

The structure of S is the following. 

S calls A with the public key n and the security parameters l, s_0, s_1 as
input. Then it waits for the queries of A.


1. If A queries the hash value for r, then
a. if (r, h(r)) is on the list H of previous responses to hash queries, then
S again sends h(r) as hash value to A;
b. else S applies algorithm CompRoot1(c, r); if S finds a root of c modulo
n in this way, then it returns the root and terminates;
c. else S picks a random bit string v of length (l + s_0), puts (r, v) on its
list H of hash values and sends v = h(r) as hash value to A.
2. If A queries the decryption of a ciphertext c', then
a. S applies algorithm CompRoot2(c, c'); if S finds a root of c modulo
n in this way, it returns the root and terminates;
b. else S applies, for each (r', h(r')) on the list H of previous hash values,
algorithm CompRoot1(c', r'); if S finds a square root v'||r' of c' in this
way, then it computes w' = v' ⊕ h(r'); if the s_0 lower significant bits
of w' are all 0, i.e., w' = m'||0^{s_0}, then S sends m' as plaintext to A,
else S rejects c' as an "invalid ciphertext";
c. else S could not find a square root of c' in the previous step b and
rejects c' as an "invalid ciphertext".
3. If A produces the two candidate plaintexts m_0, m_1 ∈ {0,1}^l, then S sends
c as encryption of m_0 or m_1 to A.


If A terminates, then S also terminates (if it has not terminated before). 


To analyze algorithm S, let y_1 and y_2 be the two square roots of c with
0 ≤ y_1, y_2 < 2^k (take y_2 = y_1, if there is only one). The goal of S is to
compute y_1 or y_2. We decompose y_i = v_i||r_i, with r_i ∈ {0,1}^{s_1}.

We consider several cases. 


1. If A happens to query the hash value for r_1 or r_2, then S successfully
finds one of the roots y_1, y_2 in step 1b.

2. If A happens to query the decryption of a ciphertext c', and if c' has a
k-bit square root y' modulo n, whose (s_0 + s_1) lower significant bits are
the same as those of y_1 or y_2, then S successfully finds one of the roots
y_1, y_2 in step 2a.


We observed above that S can easily play the role of the random oracle
h and answer hash queries. Sometimes, S can answer decryption requests
correctly.


3. If A queries the decryption of a ciphertext c' that is valid with respect
to its square root y' modulo n, and if A has previously asked for the
hash value h(r') of the s_1 rightmost bits r' of y' (c' = y'^2 mod n; y' =
v'||r', r' ∈ {0,1}^{s_1}), then S responds in step 2b to A with the correct
plaintext.


But S does not know the secret key, and so S cannot perfectly simulate
Bob. We now study the two cases where, from A's point of view, the behavior
of S might appear different from Bob's behavior.

In the real attack, Bob sends a valid encryption of m_0 or m_1 as challenge
ciphertext to A. The choice of the number c, which S sends as challenge
ciphertext, has nothing to do with m_0 or m_1. Therefore, we have to study
the next case 4.


4. The number c, which S presents as challenge ciphertext at the end of
phase I, is, from A's point of view, not a valid encryption of m_0 or m_1 with
respect to y_i, where i = 1 or i = 2. This means that v_i ⊕ h(r_i) ≠ m_b||0^{s_0}
or, equivalently, h(r_i) ≠ (m_b||0^{s_0}) ⊕ v_i for b = 0, 1.

This can only happen either
• if A has asked for h(r_i) in phase I, and S has responded with a
non-appropriate hash value for r_i, or
• if A has asked in phase I for the decryption of a ciphertext c' with
c' = y'^2 mod n, y' = v'||r', r' ∈ {0,1}^{s_1}, v' ∈ {0,1}^{l+s_0} and r' = r_i, i.e.,
the s_1 rightmost bits of y' and y_i are equal (see footnote 4); in this case
the answer of S might have put some restriction on h(r_i).

Otherwise, the hash value h(r_i), which is randomly generated by the
oracle, is independent of A's point of view at the end of phase I. This
implies that, from A's point of view, h(r_i) = (m_b||0^{s_0}) ⊕ v_i has the same
probability as h(r_i) = w, for any other (l + s_0)-bit string w. Therefore c
is a valid encryption of m_0 or m_1 with respect to y_i.

That A has asked for h(r_i) before can be excluded, because then S has
already successfully terminated in step 1b.

The number c is generated by squaring a randomly chosen a, and c is not
known to A in phase I. Therefore, the random choice of c is independent
from A's decryption queries in phase I. Hence, the probability that a
particular decryption query in phase I involves r_1 or r_2 is 2 · 1/2^{s_1}. There
are at most q_d decryption queries. Thus, the probability of case 4 is
≤ q_d · 2/2^{s_1}.

5. Attacker A asks for the decryption of a ciphertext c' and S rejects it (in
step 2b or step 3), whereas Bob, by using his secret key, accepts c' and
provides A with the plaintext m'. Then, we are not in case 2, because,
in case 2, S successfully terminates in step 2a before giving an answer to
A.

We assume that c is a valid ciphertext for m_0 or m_1 with respect to both
square roots y_1 and y_2, i.e., that we are not in case 4. This means that
for i = 1 and i = 2, we have v_i ⊕ h(r_i) = m_b||0^{s_0} for b = 0 or b = 1.

Decrypting c', Bob finds:


c' = y'^2 mod n, y' = v'||r', v' ∈ {0,1}^{l+s_0}, r' ∈ {0,1}^{s_1}, v' ⊕ h(r') = m'||0^{s_0}.


If r' = r_i for i = 1 or i = 2, then the s_0 rightmost bits of v' and v_i are
the same; they are equal to the s_0 rightmost bits of h(r') = h(r_i). Then
the (s_0 + s_1) rightmost bits of y' and y_i coincide and we are in case 2,
which we have excluded. Hence r' ≠ r_1 and r' ≠ r_2.

The hash value h(r') has not been queried before, because otherwise we
would be in case 3, and S would have accepted c' and responded with
the plaintext m'.

Hence, the hash value h(r') is a random value which is independent from
preceding hash queries and our assumption that c is a valid encryption
of m_0 or m_1 (which constrains h(r_1) and h(r_2), see above). But then the
probability that the s_0 rightmost bits of h(r') ⊕ v' are all 0 (which is
necessary for Bob to accept) is 1/2^{s_0}. Since there are at most 2 square
roots y', and A queries the decryption of q_d ciphertexts, the probability
that case 5 occurs is ≤ q_d · 2/2^{s_0}.


4 Note that S successfully terminates in step 2a only if there are (s_0 + s_1) common
rightmost bits.
Now assume that neither case 4 nor case 5 occurs. Following Boneh, we
call this assumption GoodSim.

Under this assumption GoodSim, S behaves exactly like Bob and the
random oracle h. Thus, A operates in the same probabilistic setting, and it
therefore has the same probability of success 1/2 + ε as in the original attack.

If, in addition, cases 1 and 2 do not occur, then the randomly generated c
(and r_1, r_2), and the hash values h(r_1) and h(r_2), which are generated by the
random oracle, are independent of all hash and decryption queries issued by A
and the responses given by S. Therefore A has not collected any information
about h(r_1) and h(r_2). This means that the ciphertext c hides the plaintext
m_b perfectly in the information-theoretic sense: the encryption includes a
bitwise XOR with a truly random bit string h(r_i). We have seen in Section
9.2 that then A's probability of correctly distinguishing between m_0 and m_1
is 1/2.

Therefore, the advantage ε in A's probability of success necessarily results
from the cases 1, 2 (we still assume GoodSim). In the cases 1 and 2, the
algorithm S successfully computes a square root of c. Hence the probability
of success prob_success(S | GoodSim) of algorithm S assuming GoodSim is ≥
ε. The probability of the cases 4 and 5 is ≤ 2q_d/2^{s_1} + 2q_d/2^{s_0} and hence
prob(GoodSim) ≥ 1 − 2q_d/2^{s_1} − 2q_d/2^{s_0}, and we conclude that the probability
of success of S is


≥ prob(GoodSim) · prob_success(S | GoodSim) ≥ ε · (1 − 2q_d/2^{s_0} − 2q_d/2^{s_1}).


The running time of S is essentially the running time of A plus the running
times of the Coppersmith algorithms. The algorithm for quadratic polynomials
is called after each of the q_h hash queries (step 1b). After each of the
q_d decryption queries, it may be called for all of the ≤ q_h known hash pairs
(r, h(r)) (step 2b). The algorithm for polynomials of degree 4 is called after
each of the q_d decryption queries (step 2a). Thus we can estimate the running
time of S by


t + O(q_d q_h t_C + q_d t'_C),


and we see that the algorithm S gives a tight reduction from the problem of
attacking SAEP to the problem of factoring.


Remark. True random functions can not be implemented in practice. There- 
fore, a proof in the random oracle model — treating hash functions as equiv- 
alent to random functions — can never be a complete proof of security for 
a cryptographic scheme. But, intuitively, the random oracle model seems to 
be reasonable. In practice, a well-designed hash function should never have 
any features that distinguish it from a random function and that an attacker 
could exploit (see Section 3.4.4). 

In recent years, doubts about the random oracle model have been ex- 
pressed. Examples of cryptographic schemes were constructed which are prov- 
ably secure in the random oracle model, but are insecure in any real-world 
implementation, where the random oracle is replaced by a real hash function
([CanGolHal98]; [GolTau03]; [GolTau03a]; [MauRenHol04]; [CanGolHal04]).
However, the examples appear contrived and far from systems that would be 
designed in the real world. The confidence in the soundness of the random 
oracle assumption is still high among cryptographers, and the random oracle 
model is still considered a useful tool for validating cryptographic construc- 
tions. See, for example, [KobMen05] for a discussion of this point. 

Nevertheless, it is desirable to have encryption schemes whose security 
can be proven solely under a standard assumption that some computational 
problem in number theory can not be efficiently solved. An example of such 
a scheme is given in the next section. 


9.5.2 Security Under Standard Assumptions 


The Cramer-Shoup Public Key Encryption Scheme. The security 
of the Cramer-Shoup public key encryption scheme ([CraSho98]) against 
adaptively-chosen-ciphertext attacks can be proven assuming that the deci- 
sion Diffie-Hellman problem (Section 4.5.3) can not be solved efficiently and 
that the hash function used is collision-resistant. The random oracle model is 
not needed. In Chapter 10, we will give examples of signature schemes whose 
security can be proven solely under the assumed difficulty of computational 
problems (for example, Cramer-Shoup’s signature scheme). 

First, we recall the decision Diffie-Hellman problem (see Section 4.5.3). 

Let p and q be large prime numbers, such that q is a divisor of p − 1, and
let G_q be the (unique) subgroup of order q of Z_p^*. G_q is a cyclic group, and
every element g ∈ Z_p^* of order q is a generator of G_q (see Lemma A.40).

Given g_1, u_1 = g_1^x, g_2 = g_1^y, u_2 with random elements g_1, u_2 ∈ G_q and
randomly chosen exponents x, y ∈ Z_q^*, decide if u_2 = g_1^{xy}. This is equivalent
to decide, for randomly (and independently) chosen elements g_1, u_1, g_2, u_2 ∈
G_q, if

log_{g_2}(u_2) = log_{g_1}(u_1).


If the equality holds, we say that (g_1, u_1, g_2, u_2) has the Diffie-Hellman
property.

The decision Diffie-Hellman assumption says that no probabilistic poly- 
nomial algorithm exists to solve the decision Diffie-Hellman problem. 
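
In a toy group the Diffie-Hellman property can be checked by brute-force discrete logarithms, which makes the definition easy to experiment with. The parameters p = 23, q = 11 and the function names are assumptions for illustration; for cryptographic group sizes this exhaustive search is exactly what is assumed to be infeasible.

```python
P, Q = 23, 11          # toy primes (assumption): Q divides P - 1


def dlog(g, u):
    """Discrete logarithm of u to the base g in the subgroup of order Q."""
    for k in range(Q):
        if pow(g, k, P) == u:
            return k
    raise ValueError("u is not in the subgroup generated by g")


def has_dh_property(g1, u1, g2, u2):
    return dlog(g1, u1) == dlog(g2, u2)
```

Here 2 and 3 both have order 11 modulo 23. The tuple (2, 8, 3, 4) has the Diffie-Hellman property because 8 = 2^3 and 4 = 3^3 mod 23, whereas (2, 8, 3, 12) does not, since 12 = 3^4 mod 23.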


Notation. For pairs u = (u_1, u_2), x = (x_1, x_2) and a scalar value r we
shortly write u^x := u_1^{x_1} u_2^{x_2}, u^{rx} := u_1^{r x_1} u_2^{r x_2}.


Key Generation. Bob randomly generates large prime numbers p and q,
such that q is a divisor of p − 1. He randomly chooses a pair g = (g_1, g_2) of
elements g_1, g_2 ∈ G_q (see footnote 5).


5 Compare, for example, the key generation in the Digital Signature Standard,
Section 3.5.3.
Then Bob randomly chooses three pairs of exponents x = (x_1, x_2), y =
(y_1, y_2), z = (z_1, z_2), with x_1, x_2, y_1, y_2, z_1, z_2 ∈ Z_q^*, and computes modulo
p:

d := g^x = g_1^{x_1} g_2^{x_2}, e := g^y = g_1^{y_1} g_2^{y_2}, f := g^z = g_1^{z_1} g_2^{z_2}.


Bob's public key is (p, q, g, d, e, f), his private key is (x, y, z).
For encryption, we need a collision-resistant hash function

h: {0,1}* → Z_q = {0, 1, ..., q − 1}.


h outputs bit strings of length |q| = ⌊log_2(q)⌋ + 1. In practice, as in DSS, we
might have |q| = 160 and h = SHA-1, see Section 3.5.3.


Encryption. Alice can encrypt messages m ∈ G_q for Bob, i.e., elements m
of order q in Z_p^*. To encrypt a message m for Bob, Alice chooses a random
r ∈ Z_q^* and computes the ciphertext


c := (u_1, u_2, w, v) := (g_1^r, g_2^r, f^r m, v), with v := d^r e^{r h(u_1, u_2, w)}.


Note that the Diffie-Hellman property holds for (g_1, u_1, g_2, u_2).


Decryption. Bob decrypts a ciphertext c = (u_1, u_2, w, v) by using his
private key (x, y, z) as follows:


1. Bob checks the verification code v. He checks if u^{x + y h(u_1, u_2, w)} = v. If
this equation does not hold, he rejects c.
2. He recovers the plaintext by computing w · u^{−z}.
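
Key generation, encryption and decryption can be traced in a toy group. The following sketch makes several assumptions for illustration only: the parameters p = 23, q = 11, the generators 2 and 3, SHA-256 reduced modulo q as a stand-in for the hash function h, and all function names. Negative exponents in the three-argument `pow` require Python 3.8 or later.

```python
import hashlib
import secrets

P, Q = 23, 11            # toy primes (assumption): Q divides P - 1
G1, G2 = 2, 3            # two elements of order 11 in Z_23^*


def h(u1, u2, w):
    """Stand-in hash into Z_q (assumption: SHA-256 reduced modulo Q)."""
    data = f"{u1},{u2},{w}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


def rand_exp():
    return 1 + secrets.randbelow(Q - 1)        # random element of Z_q^*


def keygen():
    x1, x2, y1, y2, z1, z2 = (rand_exp() for _ in range(6))
    d = pow(G1, x1, P) * pow(G2, x2, P) % P    # d = g^x
    e = pow(G1, y1, P) * pow(G2, y2, P) % P    # e = g^y
    f = pow(G1, z1, P) * pow(G2, z2, P) % P    # f = g^z
    return (d, e, f), ((x1, x2), (y1, y2), (z1, z2))


def encrypt(pk, m):
    d, e, f = pk
    r = rand_exp()
    u1, u2 = pow(G1, r, P), pow(G2, r, P)
    w = pow(f, r, P) * m % P
    alpha = h(u1, u2, w)
    v = pow(d, r, P) * pow(e, r * alpha, P) % P
    return u1, u2, w, v


def decrypt(sk, c):
    (x1, x2), (y1, y2), (z1, z2) = sk
    u1, u2, w, v = c
    alpha = h(u1, u2, w)
    # Step 1: verification code, u^(x + alpha*y) must equal v.
    if pow(u1, x1 + alpha * y1, P) * pow(u2, x2 + alpha * y2, P) % P != v:
        return None
    # Step 2: m = w * u^(-z).
    return w * pow(u1, -z1, P) * pow(u2, -z2, P) % P
```

A ciphertext whose verification code v has been altered is rejected in step 1, which is what prevents the simple malleability attacks that work against plain ElGamal-style encryption.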


Remarks: 


1. We follow here the description of Cramer-Shoup’s encryption scheme in 
[KobMen05]. The actual scheme in [CraSho98] is the special case with
z_2 = 0. Cramer and Shoup prove the security of the scheme using a
slightly weaker assumption on the hash function. They assume that h is 
a member of a universal one-way family of hash functions (as defined in 
Section 10.4). 

2. If c = (u_1, u_2, w, v) is the correct encryption of a message m, then the
verification code v passes the check in step 1,


u^{x + y h(u_1, u_2, w)} = (g^r)^x · (g^r)^{y h(u_1, u_2, w)}

= (g^x)^r · (g^y)^{r h(u_1, u_2, w)} = d^r · e^{r h(u_1, u_2, w)} = v,


and the plaintext m is recovered in step 2,


f^r = (g^z)^r = (g^r)^z = u^z, hence w · u^{−z} = m · f^r · f^{−r} = m.
3. The private key $(x, y, z) = ((x_1, x_2), (y_1, y_2), (z_1, z_2))$ is an
   element in the $\mathbb{Z}_q$-vector space $\mathbb{Z}_q^6$. Publishing the
   public key $(d, e, f)$ means to publish the following conditions on
   $(x, y, z)$:

   $$(1)\ d = g^x, \quad (2)\ e = g^y, \quad (3)\ f = g^z.$$

   These are linear equations for $x_1, x_2, y_1, y_2, z_1, z_2$. They can also
   be written as (with $\lambda := \log_{g_1}(g_2)$)

   $$(1)\ \log_{g_1}(d) = x_1 + \lambda \cdot x_2$$
   $$(2)\ \log_{g_1}(e) = y_1 + \lambda \cdot y_2$$
   $$(3)\ \log_{g_1}(f) = z_1 + \lambda \cdot z_2.$$

   The equations define lines $L_x, L_y, L_z$ in the plane $\mathbb{Z}_q^2$.
   For a given public key $(d, e, f)$, the private key element $x$ (resp.
   $y$, $z$) is a (uniformly distributed) random element of $L_x$ (resp.
   $L_y$, $L_z$); each of the elements in $L_x$ (resp. $L_y$, $L_z$) has the
   same probability $1/q$ of being $x$ (resp. $y$, $z$).

4. Let $c := (u_1, u_2, w, v)$ be a ciphertext-tuple. Then $c$ is accepted for
   decryption only if the verification condition (4) $v = u^{x + \alpha y}$,
   with $\alpha = h(u_1, u_2, w)$, holds. The verification condition is a
   linear equation for $x$ and $y$:

   $$(4)\ \log_{g_1}(v) = \log_{g_1}(u_1) x_1 + \lambda \log_{g_2}(u_2) x_2 + \alpha \log_{g_1}(u_1) y_1 + \alpha \lambda \log_{g_2}(u_2) y_2.$$

   Equations (1), (2) and (4) are linearly independent, if and only if
   $\log_{g_1}(u_1) \neq \log_{g_2}(u_2)$.

   The ciphertext-tuple $c$ is the correct encryption of a message, if and
   only if $c$ satisfies the verification condition and
   $\log_{g_1}(u_1) = \log_{g_2}(u_2)$, i.e., $(g_1, u_1, g_2, u_2)$ has the
   Diffie-Hellman property. In this case, we call $c$ a valid ciphertext.

   Now assume that $\log_{g_1}(u_1) \neq \log_{g_2}(u_2)$. Then, the
   probability that $c$ passes the check of the verification code and is not
   rejected in the decryption step 1 is $1/q$ and hence negligibly small.
   Namely, (1), (2) and (4) are linearly independent. This implies that
   $(x, y)$ is an element of the line in $\mathbb{Z}_q^4$ which is defined by
   (1), (2), (4). The probability for that is $1/q$, since $(x, y)$ is a
   random element of the 2-dimensional space $L_x \times L_y$.
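
The counting argument of Remark 4 can be checked by brute force for a tiny group order. The following sketch works directly with the exponents, i.e., with the linear equations (1), (2) and (4) over Z_q. The value q = 11 and the concrete numbers used for lambda, alpha, the logarithms and the right-hand sides are arbitrary assumptions made for illustration, as are the function names.

```python
from itertools import product

Q = 11                      # toy group order (assumption)
LAM = 3                     # lambda = log_{g1}(g2), must be nonzero
A, B = 5, 7                 # log_{g1}(d) and log_{g1}(e)

def consistent_keys():
    """All (x1, x2, y1, y2) in Z_q^4 satisfying equations (1) and (2)."""
    return [
        (x1, x2, y1, y2)
        for x1, x2, y1, y2 in product(range(Q), repeat=4)
        if (x1 + LAM * x2) % Q == A and (y1 + LAM * y2) % Q == B
    ]

def accepting_keys(r1, r2, alpha, log_v):
    """Keys consistent with (1), (2) that also satisfy equation (4).

    r1 = log_{g1}(u1), r2 = log_{g2}(u2), log_v = log_{g1}(v).
    """
    return [
        (x1, x2, y1, y2)
        for x1, x2, y1, y2 in consistent_keys()
        if (r1 * x1 + LAM * r2 * x2
            + alpha * r1 * y1 + alpha * LAM * r2 * y2) % Q == log_v
    ]

total = len(consistent_keys())        # keys that fit the public key
invalid = len(accepting_keys(r1=2, r2=6, alpha=4, log_v=9))   # r1 != r2
print(total, invalid, invalid / total)
```

Here `total` is q^2 (the plane L_x x L_y) and `invalid` is q (a line in it), so a tuple with r1 != r2 is accepted by a fraction 1/q of the keys that are consistent with the public key.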


Theorem 9.20. Let the decision Diffie-Hellman assumption be true and let
$h$ be a collision-resistant hash function. Then the probability of success of
any attacking algorithm $A$, which on input of a random public key
$(p, q, g, d, e, f)$ executes an adaptively-chosen-ciphertext attack and tries
to distinguish between two plaintexts, is $\le 1/2 + \varepsilon$, with
$\varepsilon$ negligible.


Proof. We discuss the essential ideas of the proof. For more details, we refer 
to [CraSho98]. 
The proof runs by contradiction. Assume that there is an algorithm $A$
which on input of a randomly generated public key successfully distinguishes
between ciphertexts in an adaptively-chosen-ciphertext attack, with a
probability of success of $1/2 + \varepsilon$, $\varepsilon$ non-negligible.

Then we can construct a probabilistic polynomial algorithm $S$ which
answers the decision Diffie-Hellman problem with a probability close to 1, in
contradiction to the decision Diffie-Hellman assumption. The algorithm $S$
successfully finds out whether a random 4-tuple $(g_1, u_1, g_2, u_2)$ has the
Diffie-Hellman property or not.

As in the proof of Theorem 9.19, algorithm S interacts with the attack- 
ing algorithm A. In the real attack, A interacts with Bob, the legitimate 
owner of the secret key, to obtain decryptions of ciphertexts of its choice (in 
practice, interacting typically means to communicate with another computer 
program). Now S is constructed to replace Bob in the attack. S “simulates” 
Bob. 

On input of $(p, q, g_1, u_1, g_2, u_2)$ the algorithm $S$ repeatedly generates
a random private key $(x, y, z)$ and computes the corresponding public key
$d = g^x$, $e = g^y$, $f = g^z$. Then $S$ calls $A$ with the public key, and
$A$ executes its attack. The difference to the real attack is that $A$
communicates with $S$ instead of Bob. At some point (end of phase I of the
attack), $A$ outputs two plaintexts $m_0, m_1$. $S$ randomly selects a bit
$b \in \{0, 1\}$, sets

$$w := u^z m_b, \quad \alpha := h(u_1, u_2, w), \quad v := u^{x + \alpha y}$$

and sends $c := (u_1, u_2, w, v)$ to $A$, as an encryption of $m_b$. The
verification code $v$ is correct by construction, even if
$\log_{g_1} u_1 \neq \log_{g_2} u_2$ and $c$ is not a valid encryption of $m_b$.

At any time in phase I or phase II, $A$ can request the decryption of a
ciphertext $c' = (u_1', u_2', w', v')$, $c' \neq c$. If $c'$ satisfies the
verification condition, then equation (4) holds for $c'$, i.e., we have, with
$\alpha' = h(u_1', u_2', w')$:

$$(5)\ \log_{g_1}(v') = \log_{g_1}(u_1') x_1 + \lambda \log_{g_2}(u_2') x_2 + \alpha' \log_{g_1}(u_1') y_1 + \alpha' \lambda \log_{g_2}(u_2') y_2.$$

We observe: If $\log_{g_1}(u_1) \neq \log_{g_2}(u_2)$ and
$\log_{g_1}(u_1') \neq \log_{g_2}(u_2')$, then the equations (1), (2), (4), (5)
are linearly independent, if and only if $\alpha' \neq \alpha$.

If $c'$ satisfies the verification condition, then, up to negligible
probability, $c'$ is a valid ciphertext (see Remark 4 above) and $S$ answers
with the correct plaintext. $S$ can easily check the verification condition
and decrypt the ciphertext, because it knows the secret key. Here, $S$ behaves
exactly like Bob in the real attack.

If $(g_1, u_1, g_2, u_2)$ has the Diffie-Hellman property, then the ciphertext
$c := (u_1, u_2, w, v)$, which $S$ presents to $A$ as an encryption of $m_b$,
is a valid ciphertext, and the probability distribution of $c$ is the same as
if Bob produced the ciphertext. Hence, in this case, $A$ operates in the same
setting as in the real attack.

If $(g_1, u_1, g_2, u_2)$ does not have the Diffie-Hellman property, then $A$
operates in a setting which does not occur in the real attack, since Bob only
produces valid ciphertexts. We do not know how attacker $A$ responds to the
modified setting. For example, $A$ could fail to terminate in its expected
running time or it could output an error message or it could produce the usual
output and answer, whether $m_0$ or $m_1$ is encrypted. But fortunately the
concrete behavior of $A$ is not relevant in this case: if
$(g_1, u_1, g_2, u_2)$ does not have the Diffie-Hellman property, then it is
guaranteed that up to a negligible probability, $m_b$ is perfectly hidden in
$w = u^z m_b$ to $A$ and $A$ cannot generate any advantage from knowing $w$ in
distinguishing between $m_0$ and $m_1$.

To understand this, assume that $(g_1, u_1, g_2, u_2)$ does not have the
Diffie-Hellman property. Then the probability that attacker $A$ does get some
information about the private key $z$ by asking for the decryption of
ciphertexts $c'$ of its choice, $c' \neq c$, is negligibly small.

Namely, let $A$ ask for the decryption of a ciphertext
$c' = (u_1', u_2', w', v')$. Let $\alpha' = h(u_1', u_2', w')$.

Since we assume that $\log_{g_1}(u_1) \neq \log_{g_2}(u_2)$, equations (1),
(2), (4) are linearly independent and define a line $L$. From $A$’s point of
view, the private key $(x, y) = (x_1, x_2, y_1, y_2)$ is a random point on the
line $L$.

We have to distinguish between several cases. 


1. $\alpha' = \alpha$. Since $h$ is collision-resistant⁶, the probability that
   $A$ can generate a triple $(u_1', u_2', w') \neq (u_1, u_2, w)$, with
   $h(u_1', u_2', w') = \alpha' = \alpha = h(u_1, u_2, w)$, is negligibly
   small. Hence the case $\alpha' = \alpha$ can happen, up to a negligible
   probability, only if $(u_1', u_2', w') = (u_1, u_2, w)$. But then
   $v' \neq v$, because $c' \neq c$, and hence
   $v' \neq (u')^{x + \alpha' y} = u^{x + \alpha y} = v$, and $c'$ is rejected.

2. Now assume that $\alpha' \neq \alpha$ and $(g_1, u_1', g_2, u_2')$ does not
   have the Diffie-Hellman property. If $c'$ is not rejected, then the
   following equation (5) holds (it is equation (4) stated for $c'$):

   $$(5)\ \log_{g_1}(v') = \log_{g_1}(u_1') x_1 + \lambda \log_{g_2}(u_2') x_2 + \alpha' \log_{g_1}(u_1') y_1 + \alpha' \lambda \log_{g_2}(u_2') y_2.$$

   Since $\alpha' \neq \alpha$, equation (5) puts, in addition to (1), (2),
   (4), another linearly independent condition on $(x, y)$. Hence, at most one
   point $(\tilde{x}, \tilde{y})$ satisfies these equations. Since $(x, y)$ is
   a randomly chosen element on the line $L$, the probability that
   $(x, y) = (\tilde{x}, \tilde{y})$ is $\le 1/q$. Hence, up to a negligible
   probability, $c'$ is rejected.

3. The remaining case is that $\alpha' \neq \alpha$ and
   $(g_1, u_1', g_2, u_2')$ has the Diffie-Hellman property. Either the check
   of the verification code $v'$ fails and $c'$ is rejected or equation (5)
   holds. In the latter case, $S$ decrypts $c'$ and provides $A$ with the
   plaintext $m'$. Thus, what $A$ learns from the decryption is the equation
   (6) $m' = w' \cdot u'^{-z}$, which can be reformulated as

   $$(6)\ \log_{g_1}(w' \cdot m'^{-1}) = \log_{g_1}(u_1') z_1 + \lambda \log_{g_2}(u_2') z_2 = \log_{g_1}(u_1')(z_1 + \lambda z_2).$$

   But this equation is linearly dependent on (3), so it does not give $A$
   any additional information on $z$. $A$ learns from
   $m' = w' \cdot u'^{-z}$ the same information that it already knew from the
   public-secret-key equation (3) $f = g^z$: the private key $z$ is a random
   element on the line $L_z$.


6 Second-pre-image resistance would suffice.


Summarizing, if $(g_1, u_1, g_2, u_2)$ does not have the Diffie-Hellman
property, then up to a negligible probability, from attacker $A$’s point of
view, the secret key $z$ is a random point on a line in $\mathbb{Z}_q^2$. This
means that from $A$’s point of view, $m_b$ is perfectly (in the
information-theoretic sense) hidden in $w = u^z m_b$ by a random element of
$G_q$. We learnt in Section 9.2 that therefore $A$’s probability of
distinguishing successfully between $m_0$ and $m_1$ is exactly $1/2$. Hence,
the non-negligible advantage $\varepsilon$ of $A$ results solely from the case
that $(g_1, u_1, g_2, u_2)$ has the Diffie-Hellman property.

In order to decide whether $(g_1, u_1, g_2, u_2)$ has the Diffie-Hellman
property, $S$ repeatedly randomly generates private key elements $(x, y, z)$
and runs the attack with $A$. If $A$ correctly determines which of the
messages $m_b$ was encrypted by $S$ in significantly more than half of the
iterations, then $S$ can be almost certain that $(g_1, u_1, g_2, u_2)$ has the
Diffie-Hellman property. Otherwise, $S$ is almost certain that it does not.


9.6 Unconditional Security of Cryptosystems 


The security of many currently used cryptosystems, in particular that of all 
public-key cryptosystems, is based on the hardness of an underlying computa- 
tional problem, such as factoring integers or computing discrete logarithms. 
Security proofs for these systems show that the ability of an adversary to 
perform a successful attack contradicts the assumed difficulty of the com- 
putational problem. A typical security proof was given in Section 9.4. We 
proved that public-key one-time pads induced by one-way permutations with 
a hard-core predicate are computationally secret. Thus, the security of the 
encryption scheme is reduced to the one-way feature of function families, 
such as the RSA or modular squaring families, and the one-way feature of 
these families is, in turn, based on the assumed hardness of inverting modu- 
lar exponentiation or factoring a large integer (see Chapter 6). The security 
proof is conditional, and there is some risk that in the future, the underlying 
condition will turn out to be false. 

On the other hand, Shannon’s information-theoretic model of security 
provides unconditional security. The perfect secrecy of Vernam’s one-time 
pad is not dependent on the hardness of a computational problem, or limits
of the computational power of an adversary. 

Although perfect secrecy is not reachable in most practical situations, 
there are various promising attempts to design practical cryptosystems whose 
security is not based on assumptions and which provably come close to perfect 
information-theoretic security. 

One approach is quantum cryptography, introduced by Bennett and Bras- 
sard. Two parties agree on a secret key by transmitting polarized photons 
over a fiber-optic channel. The secrecy of the key is based on the uncertainty 
principle of quantum mechanics ([BraCre96]). 

In other approaches, the unconditional security of cryptosystems is based 
on the fact that communication channels are noisy (and hence, an eaves- 
dropper never gets all the information), or on the limited storage capacity of 
an adversary (see, e.g., [Maurer99] for an overview on information-theoretic 
cryptography). 


9.6.1 The Bounded Storage Model 


We will give a short introduction to encryption schemes designed by Maurer et 
al., whose unconditional security is guaranteed by a limit on the total amount 
of storage capacity available to an adversary. Most of the encryption schemes 
studied in this approach are similar to the one proposed in [Maurer92], which 
we are going to describe now. 

Alice wants to transmit messages $m \in M := \{0,1\}^n$ to Bob. She uses a
one-time pad for encryption, i.e., she XORs the message $m$ bitwise with a
one-time key $k$. As usual, we have a probability distribution on $M$ which,
together with the probabilistic choice of the keys, yields a probability
distribution on the set $C$ of ciphertexts. Without loss of generality, we
assume that prob($m$) > 0 for all $m \in M$, and prob($c$) > 0 for all
$c \in C$. The eavesdropper Eve is a passive attacker: observing the
ciphertext $c \in C \subseteq \{0,1\}^n$, she tries to get information about
the plaintext.

Alice and Bob extract the key $k$ for the encryption of a message $m$
from a publicly accessible “table of random bits”. Security is achieved if
Eve has access only to some part of the table. This requires some clever
realization of the public table of random bits. A possibly realistic scenario
(the “satellite scenario”) is that the random bits are broadcast by some radio
source (e.g., a satellite or a natural deep-space radio source) with a
sufficiently high data rate (see the example below) and Eve can store only
parts of it.

So, we assume that there is a public source broadcasting truly (uniformly 
distributed) random bits to Alice, Bob and Eve at a high speed. The com- 
munication channels are assumed to be error free. 

Alice and Bob select their key bits from the bit stream over some time period
$T$, according to some private strategy not known to Eve. The ciphertext
is transmitted later, after the end of $T$. Due to her limited storage
resources, Eve can store only a small part of the bits broadcast during $T$.
To extract the one-time key $k$ for a message $m \in \{0,1\}^n$, Alice and Bob
synchronize on the source and listen to the broadcast bits over the time period
$T$. Let $R$ (called the randomizer) be the random bits transmitted during $T$,
and let $(r_{i,j} \mid 1 \le i \le l,\ 0 \le j \le t-1)$ be these bits arranged
as elements of a matrix with $l$ rows and $t$ columns. Thus, $R$ contains
$|R| = lt$ bits. Typically, $l$ is small (about 50) and $t$ is huge, even when
compared with the length $n$ of the plaintext message $m$.

Alice and Bob have agreed on a (short) private key $(s_1, \ldots, s_l)$ in
advance, with $s_i$ randomly chosen in $\{0, \ldots, t-1\}$ (with respect to
the uniform distribution), and take as key $k := k_0 \ldots k_{n-1}$, with

$$k_j := r_{1,(s_1+j) \bmod t} \oplus r_{2,(s_2+j) \bmod t} \oplus \cdots \oplus r_{l,(s_l+j) \bmod t}.$$

In other words, Alice and Bob select from each row $i$ the bit string $b_i$ of
length $n$, starting at the randomly chosen “seed” $s_i$ (jumping back to the
beginning if the end of the row is reached) and then get their key $k$ by
XORing these strings $b_i$.
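
The key extraction just described can be sketched in Python as follows. The function names, the representation of the randomizer as a list of rows and the toy sizes (4 rows, 32 columns, 8 message bits) are assumptions made for illustration; in the scheme itself the number of columns is huge.

```python
import secrets

def extract_key(randomizer, seeds, n):
    """Return key bits k_0..k_{n-1} with k_j = XOR over rows i of r[i][(s_i + j) mod t]."""
    t = len(randomizer[0])
    key = []
    for j in range(n):
        bit = 0
        for row, s in zip(randomizer, seeds):
            bit ^= row[(s + j) % t]      # wrap around at the end of the row
        key.append(bit)
    return key

def xor_bits(a, b):
    """Bitwise XOR of two equally long bit lists (the one-time pad)."""
    return [x ^ y for x, y in zip(a, b)]

# Toy sizes (assumption): l = 4 rows, t = 32 columns, n = 8 message bits.
l, t, n = 4, 32, 8
randomizer = [[secrets.randbits(1) for _ in range(t)] for _ in range(l)]
seeds = [secrets.randbelow(t) for _ in range(l)]   # private key (s_1, ..., s_l)

message = [1, 0, 1, 1, 0, 0, 1, 0]
key = extract_key(randomizer, seeds, n)
ciphertext = xor_bits(message, key)
assert xor_bits(ciphertext, key) == message        # Bob recovers m
```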

Attacker Eve also listens to the random source during $T$ and stores some
of the bits, hoping that this will help her to extract information when the
encrypted message is transmitted after the end of $T$. Due to her limited
storage space, she stores only $q$ of the bits $r_{i,j}$. The bits are selected
by some probabilistic algorithm.⁷ For each of these $q$ bits, she knows the
value and possibly the position in the random bit stream $R$. Eve’s knowledge
about the randomizer bits is summarized by the random variable $S$. You may
also consider $S$ as a probabilistic algorithm, returning $q$ positions and bit
values. As usual in the study of a one-time pad encryption scheme, Eve may know
the distribution on $M$. We assume that she has no further a priori knowledge
about the messages $m \in M$ actually sent to Bob by Alice. The following
theorem is proven in [Maurer92].


Theorem 9.21. There exists an event $\mathcal{E}$, such that for all
probabilistic strategies $S$ for storing $q$ bits of the randomizer $R$

$$I(M; CS \mid \mathcal{E}) = 0 \quad \text{and} \quad \mathrm{prob}(\mathcal{E}) \ge 1 - n\delta^l,$$

where $\delta := q/|R|$ is the fraction of randomizer bits stored by Eve.


Proof. We sketch the basic idea and refer the interested reader to [Maurer92].
Let $(s_1, \ldots, s_l)$ be the private key of Alice and Bob, and
$k := k_0 \ldots k_{n-1}$ be the key extracted from $R$ by Alice and Bob. Then
bit $k_j$ is derived from $R$ by XORing the bits
$r_{1,(s_1+j) \bmod t}, r_{2,(s_2+j) \bmod t}, \ldots, r_{l,(s_l+j) \bmod t}$.

If Eve’s storage strategy misses even one of these bits, then the resulting
bit $k_j$ appears truly random to her, despite her knowledge $S$ of the
randomizer. The probability of the event $\mathcal{F}$ that Eve stores with her
strategy all the bits
$r_{1,(s_1+j) \bmod t}, r_{2,(s_2+j) \bmod t}, \ldots, r_{l,(s_l+j) \bmod t}$,
for at least one $j$, $0 \le j \le n-1$, is very small: it turns out to be
$\le n\delta^l$.

The “security event” $\mathcal{E}$ is defined as the complement of
$\mathcal{F}$. If $\mathcal{E}$ occurs, then, from Eve’s point of view, the key
extracted from $R$ by Alice and Bob is truly random, and we have the situation
of Vernam’s one-time pad, with a mutual information which equals 0 (see
Theorem 9.5).


7 Note that there are no restrictions on the computing power of Eve.


Remarks: 


1. The mutual information in Theorem 9.21 is conditional on an event
   $\mathcal{E}$. This means that all the entropies involved are computed with
   the conditional probabilities assuming $\mathcal{E}$ (see the final remark
   in Appendix B.4).

2. In [Maurer92] a stronger version is proven. It also includes the case that
   Eve has some a priori knowledge $V$ of the plaintext messages, where $V$ is
   jointly distributed with $M$. Then the mutual information
   $I(M; CS \mid V, \mathcal{E})$ between $M$ and $CS$, conditioned over $V$
   and assuming $\mathcal{E}$, is 0. Conditioning over $V$ means that the
   mutual information does not include the amount of information about $M$
   resulting from the knowledge of $V$ (see Proposition B.36).

3. Adversary Eve cannot gain an advantage from learning the secret key
   $(s_1, \ldots, s_l)$ after the broadcast of the randomizer $R$. Therefore,
   Theorem 9.21 is a statement on everlasting security.

4. The model of attack applied here is somewhat restricted. In the first 
phase, while listening to the random source, the eavesdropper Eve does 
not exploit her full computing power; she simply stores some of the trans- 
mitted bits and does not use the bit stream as input for computations 
at that time. The general model of attack, where Eve may compute and 
store arbitrary bits of information about the randomizer, is considered 
below. 


Example. This example is derived from a similar example in [CachMau97]. A
satellite broadcasting random bits at a rate of 16 Gbit/s is used for one day
to provide a randomizer table $R$ with about $1.5 \cdot 10^{15}$ bits. Let $R$
be arranged in $l := 100$ rows and $t := 1.5 \cdot 10^{13}$ columns. Let the
plaintexts be 6 MB, i.e., $n \approx 5 \cdot 10^7$ bits. Alice and Bob have to
agree on a private key $(s_1, \ldots, s_l)$ of
$100 \cdot \log_2(1.5 \cdot 10^{13}) \approx 4380$ bits. The storage capacity
of the adversary Eve is assumed to be 100 TB, which equals about
$8.8 \cdot 10^{14}$ bits. Then $\delta \approx 0.587$ and

$$\mathrm{prob}(\text{not } \mathcal{E}) \le 5 \cdot 10^7 \cdot 0.587^{100} \approx 3.7 \cdot 10^{-16} < 10^{-15}.$$

Thus, the probability that Eve gets any additional information about the
plaintext by observing the ciphertext and applying an optimal storage strategy
is less than $10^{-15}$.
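
The numbers of this example can be recomputed with a few lines of Python. The variable names are chosen here for illustration, and MB and TB are read as 2^20 and 2^40 bytes, which is an assumption that reproduces the rounded figures of the example; small deviations from the printed values are due to rounding.

```python
import math

randomizer_bits = 1.5e15          # size of R as stated in the example
l = 100                           # number of rows
t = randomizer_bits / l           # number of columns
n = 6 * 2**20 * 8                 # 6 MB plaintext, in bits
storage_bits = 100 * 2**40 * 8    # Eve's 100 TB, in bits

key_bits = l * math.log2(t)       # size of the private key (s_1, ..., s_l)
delta = storage_bits / randomizer_bits
failure_bound = n * delta**l      # bound n * delta^l on prob(not E)

print(f"t = {t:.2e} columns, n = {n:.2e} bits")
print(f"private key: about {key_bits:.0f} bits, delta = {delta:.3f}")
print(f"prob(not E) <= {failure_bound:.1e}")
```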

Theorem 9.21 may also be interpreted in terms of distinguishing algorithms
(see Proposition 9.10). We denote by $E$ the probabilistic encryption
algorithm which encrypts $m$ as $c := m \oplus k$, with $k$ randomly chosen as
above.


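Theorem 9.22. For every probabilistic storage strategy $S$ storing a fraction
$\delta$ of all randomizer bits, and every probabilistic distinguishing
algorithm $A(m_0, m_1, c, s)$ and all $m_0, m_1 \in M$, with $m_0 \neq m_1$,

$$\mathrm{prob}(A(m_0, m_1, c, s) = m : m \xleftarrow{u} \{m_0, m_1\},\ c \leftarrow E(m),\ s \leftarrow S) \le \frac{1}{2} + n\delta^l.$$

Remark. This statement is equivalent to

$$|\,\mathrm{prob}(A(m_0, m_1, c, s) = m_0 : c \leftarrow E(m_0),\ s \leftarrow S) - \mathrm{prob}(A(m_0, m_1, c, s) = m_0 : c \leftarrow E(m_1),\ s \leftarrow S)\,| \le n\delta^l,$$

as the same computation as in the proof of Proposition 9.10 shows.

Proof. From Theorem 9.21 and Proposition B.32, we conclude that

$$\mathrm{prob}(c, s \mid m_0, \mathcal{E}) = \mathrm{prob}(c, s \mid m_1, \mathcal{E})$$

for all $m_0, m_1 \in M$, $c \in C$ and $s \in S$. Computing with probabilities
conditional on $\mathcal{E}$, we get

$$\mathrm{prob}(A(m_0, m_1, c, s) = m_0 \mid \mathcal{E} : c \leftarrow E(m_0),\ s \leftarrow S)$$
$$= \sum_{c,s} \mathrm{prob}(c, s \mid m_0, \mathcal{E}) \cdot \mathrm{prob}(A(m_0, m_1, c, s) = m_0 \mid \mathcal{E})$$
$$= \sum_{c,s} \mathrm{prob}(c, s \mid m_1, \mathcal{E}) \cdot \mathrm{prob}(A(m_0, m_1, c, s) = m_0 \mid \mathcal{E})$$
$$= \mathrm{prob}(A(m_0, m_1, c, s) = m_0 \mid \mathcal{E} : c \leftarrow E(m_1),\ s \leftarrow S).$$

Using Lemma B.9 we get

$$|\,\mathrm{prob}(A(m_0, m_1, c, s) = m_0 : c \leftarrow E(m_0),\ s \leftarrow S) - \mathrm{prob}(A(m_0, m_1, c, s) = m_0 : c \leftarrow E(m_1),\ s \leftarrow S)\,| \le \mathrm{prob}(\text{not } \mathcal{E}) \le n\delta^l,$$

as desired.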


As we observed above, the model of attack just used is somewhat restricted,
because the eavesdropper Eve does not use the bit stream as input for
computations in the first phase, while listening to the random source.
Security proofs for the general model of attack, where Eve may use her
(unlimited) computing power at any time without any restrictions, were given
only recently in a series of papers ([AumRab99]; [AumDinRab02]; [DinRab02];
[DziMau02]; [Lu02]; [Vadhan03]).

The results obtained by Aumann and Rabin ([AumRab99]) were still restricted.
The randomness efficiency, i.e., the ratio $\delta = q/|R|$ of Eve’s storage
capacity and the size of the randomizer, was very small.

Major progress on the bounded storage model was then achieved by Aumann,
Ding and Rabin ([AumDinRab02]; [Ding01]; [DinRab02]) proving the
security of schemes with a randomness efficiency of about 0.2.

Strong security results for the general model of attack were shown by
Dziembowski and Maurer in [DziMau02] (an extended version is [DziMau04a]).
They can prove that keys k, whose length n is much longer than the length
of the initial private key, can be securely generated with a randomness
efficiency which may be arbitrarily close to 1. A randomness efficiency of 0.1
is possible for reasonable parameter sizes, which appear possible in practice
(for example, with attacker Eve’s storage capacity ≈ 125 TB, a derived key
k of 1 GB for the one-time pad, an initial private key < 1 KB and a statis- 
tical distance < 2~° between the distribution of the derived key k and the 
uniform distribution, from Eve’s point of view). 

The schemes investigated in the cited papers are all very similar to the 
scheme from [Maurer92], which we explained here. Of course, security proofs 
for the general model of attack require more sophisticated methods of prob- 
ability and information theory. 

In all of the bounded-storage-model schemes that we referred to above 
one assumes that Alice and Bob share an initial secret key s, usually without 
considering how such a key s is obtained by Alice and Bob. A natural way 
would be to exchange the initial key by using a public-key key agreement 
protocol, for example, the Diffie-Hellman protocol (see Section 4.1.2). At first 
glance, this approach may appear useless, since the information-theoretic 
security against a computationally unbounded adversary Eve is lost — Eve 
could break the public-key protocol with her unlimited resources. However, 
if Eve is an attacker, who gains her infinite computing power (and then the 
initial secret key) only after the broadcast of the randomizer, then the security 
of the scheme might be preserved (see [DziMau04b] for a detailed discussion). 

In [CachMau97], another approach is discussed. In a variant of their
scheme, the key $k$ for the one-time pad is generated within the bounded
storage model, and Alice and Bob need not share an initial secret key $s$. The
general model of attack is applied in [CachMau97]: adversary Eve may use
her unlimited computing power at any time.⁸

Some of the techniques used there are basic.⁹ To illustrate these
techniques, we give a short overview of parts of [CachMau97].

As before, there is some source for truly random bits. Alice, Bob and
Eve receive these random bits over perfect channels without any errors. We
are looking for bit sequences of length $n$ to serve as keys in a one-time pad
for encrypting messages $m \in \{0,1\}^n$. The random bit source generates $N$
bits $R := (r_1, \ldots, r_N)$. The storage capacity $q$ of Eve is smaller than
$N$, so she is not able to store the whole randomizer. In contrast to the
preceding model, she not only stores $q$ bits of the randomizer, but also
executes some probabilistic algorithm $U$ while listening to the random source,
to compute $q$ bits of information from $R$ (and store them in her memory). As
before, we denote by $\delta := q/N$ the fraction of Eve’s storage capacity
with respect to the total number of randomizer bits.


8 Unfortunately, the arising schemes are either impractical or attacker Eve’s
probability of success is non-negligible, see below.

9 They are also applied, for example, in the noisy channel model, which we
discuss in Section 9.6.2.

In a first phase, called advantage distillation, Alice and Bob extract
sufficiently many, say $l$, bits $S := (s_1, \ldots, s_l)$ from $R$, at
randomly chosen positions $P := (p_1, \ldots, p_l)$:

$$s_1 := r_{p_1},\ s_2 := r_{p_2},\ \ldots,\ s_l := r_{p_l}.$$

It is necessary that the positions are chosen pairwise independently. The
positions $p_1, \ldots, p_l$ are kept secret, until the broadcast of the
randomizer is finished.

Alice and Bob can select the bits in two ways. 


1. Private key scenario: Alice and Bob agree on the positions
   $p_1, \ldots, p_l$ in advance and share these positions as an initial
   secret key.

2. Key agreement solely by public discussion: Independently, Alice
   and Bob each select and store $w$ bits of the randomizer $R$. The positions
   of the selected bits $t_1, \ldots, t_w$ and $u_1, \ldots, u_w$ are randomly
   selected, pairwise independent and uniformly distributed. When the
   broadcast of the randomizer is finished, Alice and Bob exchange the chosen
   positions over the public channel. Let
   $\{p_1, \ldots, p_l\} = \{t_1, \ldots, t_w\} \cap \{u_1, \ldots, u_w\}$.
   Then, Alice and Bob share the $l$ randomizer bits at the positions
   $p_1, \ldots, p_l$. It is easy to see that the expected number $l$ of
   common positions is $l = w^2/N$ (see, for example, Corollary B.17). Hence,
   on average, they have to select and store $\sqrt{lN}$ randomizer bits to
   obtain $l$ common bits.
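
The estimate for the expected number of common positions in the second scenario can be illustrated by a small simulation. The sizes N = 10^6 and w = 10^4, the fixed seed and the function name are assumptions chosen for illustration, and the positions are drawn fully independently here, which is a stronger property than the pairwise independence the scheme requires.

```python
import random

def common_positions(N, w, rng):
    """Alice and Bob each pick w positions uniformly; return the overlap size."""
    alice = {rng.randrange(N) for _ in range(w)}
    bob = {rng.randrange(N) for _ in range(w)}
    return len(alice & bob)

N, w = 1_000_000, 10_000            # toy sizes (assumption)
rng = random.Random(2024)           # fixed seed for reproducibility
trials = 20
average = sum(common_positions(N, w, rng) for _ in range(trials)) / trials
print(f"predicted w^2/N = {w * w / N:.0f}, simulated average = {average:.1f}")
```

The simulated average should be close to the predicted value of 100; it tends to lie slightly below it, because a party may draw the same position more than once.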


Since Eve can store at most $q$ bits, her information about $S$ is incomplete.
For example, it can be proven that Eve knows at most a fraction $\delta$
of the $l$ bits in $S$ (in the information-theoretic sense). Thus, Alice and
Bob have distilled an advantage. Let $e$ be the integer part of Eve’s
uncertainty $H(S \mid \text{Eve’s knowledge})$ about $S$. Then Eve lacks
approximately $e$ bits of information about $S$.

In a second phase, Alice and Bob apply a powerful technique, called privacy
amplification or entropy smoothing, to extract $f$ bits from $S$ in such a
way that Eve has almost no information about the resulting string
$\tilde{S}$. Here, $f$ is given by the Rényi entropy of order 2 (see below).
Since this entropy is less than or equal to Shannon’s entropy, we have
$f \le e$. Eve’s uncertainty about $\tilde{S}$ is close to $f$, so from Eve’s
point of view, $\tilde{S}$ appears almost truly random. Thus, it can serve
Alice and Bob as a provably secure key $k$ in a one-time pad.

Privacy amplification is accomplished by randomly selecting a member
from a so-called universal class of hash functions (see below). Alice randomly
selects an element $h$ from such a universal class $\mathcal{H}$ (with respect
to the uniform distribution) and sends $h$ to Bob via a public channel. Thus,
Eve may even know $\mathcal{H}$ and $h$. Alice and Bob both apply $h$ to $S$ in
order to obtain their key $k := h(S)$ for the one-time pad.

Let $H$ and $K$ be the random variables describing the probabilistic choice
of the function $h$ and the probabilistic choice of the key $k$. The following
theorem is proven in [CachMau97].


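Theorem 9.23. Given a fixed storage capacity $q$ of Eve and
$\varepsilon_1, \varepsilon_2 > 0$, there exists a security event
$\mathcal{E}$ such that

$$\mathrm{prob}(\mathcal{E}) \ge 1 - \varepsilon_1 \quad \text{and} \quad I(K; H \mid U = u, P = p, \mathcal{E}) \le \varepsilon_2,$$

and hence in particular

$$I(K; UHP \mid \mathcal{E}) \le \varepsilon_2 \quad \text{and} \quad I(K; UH \mid \mathcal{E}) \le \varepsilon_2,$$

provided the size $N$ of the randomizer $R$ and the number $l$ of elements
selected from $R$ by $S$ are sufficiently large.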


Remarks: 


1. Explicit formulae are derived in [CachMau97] which connect the bounds
   $\varepsilon_1, \varepsilon_2$, the size $N$ of the randomizer, Eve’s
   storage capacity $q$, the number $l$ of chosen positions and the number $f$
   of derived key bits.

2. The third inequality follows from the second by Proposition B.35, and 
the fourth inequality from the third by Proposition B.36 (also observe 
the final remark in Appendix B.4). 

3. $I(K; H \mid U = u, P = p, \mathcal{E}) \le \varepsilon_2$ means the
   following. Assume Eve has the specific knowledge $U = u$ about the
   randomizer and has learned the positions $P$ of the bits selected from $R$
   by Alice and Bob after the broadcast of the randomizer $R$. Then the
   average amount of information (measured in bits in the
   information-theoretic sense) that Eve can derive about the key $k$ from
   learning the hash function $h$ is less than $\varepsilon_2$, provided the
   security event $\mathcal{E}$ occurs.

   Thus, in the private key scenario, the bound $\varepsilon_2$ also holds if
   the secret key shared by Alice and Bob (i.e., the positions of the selected
   randomizer bits) is later compromised. The security is everlasting.


As we mentioned above, a key step in the proof of Theorem 9.23 is privacy
amplification to transform almost all the entropy of a bit string into a random
bit string. For this purpose, it is not sufficient to work with the classical
Shannon entropy as defined in Appendix B.4. Instead, it is necessary to use
more general information measures: the Rényi entropies of order $\alpha$
($0 \le \alpha \le \infty$, see [Rényi61]; [Rényi70]). Here, in particular, the
Rényi entropy of order 2, also called collision entropy, is needed.


Definition 9.24. Let $S$ be a random variable with values in the finite set
$\mathcal{S}$. The collision probability $\mathrm{prob}_c(S)$ of $S$ is
defined as

$$\mathrm{prob}_c(S) := \sum_{s \in \mathcal{S}} \mathrm{prob}(S = s)^2.$$

The collision entropy or Rényi entropy (of order 2) of $S$ is

$$H_2(S) := -\log_2(\mathrm{prob}_c(S)) = -\log_2\Big(\sum_{s \in \mathcal{S}} \mathrm{prob}(S = s)^2\Big).$$

$\mathrm{prob}_c(S)$ is the probability that two independent executions of $S$
yield the same result. $H_2(S)$ measures the uncertainty that two independent
executions of the random experiment $S$ yield the same result.
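
As a small numerical illustration of Definition 9.24, the following Python sketch computes the collision probability and the collision entropy of a finite distribution and compares the latter with the Shannon entropy. The function names and the two example distributions are assumptions chosen for illustration.

```python
import math

def collision_probability(dist):
    """Sum of the squared probabilities of a finite distribution."""
    return sum(p * p for p in dist)

def collision_entropy(dist):
    """Renyi entropy of order 2: -log2 of the collision probability."""
    return -math.log2(collision_probability(dist))

def shannon_entropy(dist):
    """Shannon entropy in bits; outcomes of probability 0 contribute nothing."""
    return -sum(p * math.log2(p) for p in dist if p > 0)

uniform = [0.25] * 4
skewed = [0.5, 0.25, 0.25]

for name, dist in (("uniform", uniform), ("skewed", skewed)):
    print(name, collision_probability(dist),
          round(collision_entropy(dist), 3), round(shannon_entropy(dist), 3))
```

For the uniform distribution both entropies equal 2 bits. For the skewed distribution the collision entropy is about 1.415 bits, which is below the Shannon entropy of 1.5 bits.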

The mathematical foundation of privacy amplification is the Smoothing 
Entropy Theorem. It states that almost all the collision entropy of a random 
variable S may be converted into uniform random bits by selecting a function 
h randomly from a universal class of hash functions and applying h to S (see, 
e.g., [Luby96], Lecture 8). Universal classes of hash functions were introduced 
by Carter and Wegman ([CarWeg79]; [WegCar81]). 


Definition 9.25. A set $H$ of functions $h : X \longrightarrow Y$ is called a
universal class of hash functions if for all distinct $x_1, x_2 \in X$,

$$\mathrm{prob}(h(x_1) = h(x_2) : h \xleftarrow{u} H) = \frac{1}{|Y|}.$$

$H$ is called a strongly universal class of hash functions if for all distinct
$x_1, x_2 \in X$ and all (not necessarily distinct) $y_1, y_2 \in Y$,

$$\mathrm{prob}(h(x_1) = y_1, h(x_2) = y_2 : h \xleftarrow{u} H) = \frac{1}{|Y|^2}.$$

In particular, a strongly universal class is also universal. (Strongly)
universal classes of hash functions behave like completely random functions
with respect to collisions (or value pairs).


Example. A straightforward computation shows that the set of linear mappings
$\{0,1\}^l \longrightarrow \{0,1\}^f$ is a strongly universal class of hash
functions. There are smaller classes ([Stinson92]). For example, the set

$$H := \{h_{a_0,a_1} : \mathbb{F}_{2^l} \longrightarrow \mathbb{F}_{2^f},\ x \longmapsto \mathrm{msb}_f(a_0 \cdot x + a_1) \mid a_0, a_1 \in \mathbb{F}_{2^l}\}$$

is strongly universal ($l \ge f$), and the set

$$H := \{h_a : \mathbb{F}_{2^l} \longrightarrow \mathbb{F}_{2^f},\ x \longmapsto \mathrm{msb}_f(a \cdot x) \mid a \in \mathbb{F}_{2^l}\}$$

is universal. Here, we consider $\{0,1\}^m$ as equipped with the structure
$\mathbb{F}_{2^m}$ of the Galois field with $2^m$ elements (see Appendix A.5),
and $\mathrm{msb}_f$ denotes the $f$ most-significant bits. See Exercise 8.
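
For very small parameters, the strong universality of the class of functions h_{a_0,a_1} can be verified exhaustively. The following Python sketch does this for l = 3 and f = 2. The choice of these sizes, the irreducible polynomial x^3 + x + 1 used to represent the field with 8 elements and the function names are assumptions made for illustration.

```python
from itertools import product

L_BITS, F_BITS = 3, 2
POLY = 0b1011                 # x^3 + x + 1, irreducible over F_2

def gf_mul(a, b):
    """Multiply a and b in F_{2^3} = F_2[x]/(x^3 + x + 1)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> L_BITS:       # a term of degree 3 appeared: reduce modulo POLY
            a ^= POLY
    return result

def h(a0, a1, x):
    """h_{a0,a1}(x) = msb_f(a0 * x + a1); addition in F_{2^l} is XOR."""
    return (gf_mul(a0, x) ^ a1) >> (L_BITS - F_BITS)

def is_strongly_universal():
    field = range(2 ** L_BITS)
    values = range(2 ** F_BITS)
    expected = (2 ** L_BITS) ** 2 // (2 ** F_BITS) ** 2   # |H| / |Y|^2
    for x1, x2 in product(field, repeat=2):
        if x1 == x2:
            continue
        for y1, y2 in product(values, repeat=2):
            hits = sum(
                1
                for a0, a1 in product(field, repeat=2)
                if h(a0, a1, x1) == y1 and h(a0, a1, x2) == y2
            )
            if hits != expected:
                return False
    return True

print(is_strongly_universal())
```

The check counts, for every pair of distinct inputs and every pair of target values, the functions in the class that hit both targets, and compares the count with |H| / |Y|^2.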


Remark. In the key generation scheme discussed, Alice and Bob select $w$ (or
$l$) bits at pairwise independent random positions from the $N$ bits broadcast
by the random source. They have to store the positions of these bits. At first
glance, $w \cdot \log_2(N)$ bits are necessary to describe $w$ positions. Since $w$ is
large, a huge number of bits have to be stored and transferred between Alice and
Bob. Strongly universal classes of hash functions also provide a solution to
this problem.

Assume $N = 2^m$, and consider $\{0,1\}^m$ as $\mathbb{F}_{2^m}$. Alice and Bob may work
with the strongly universal class

$$H = \{h : \mathbb{F}_{2^m} \to \mathbb{F}_{2^m},\ x \mapsto a_0 \cdot x + a_1 \mid a_0, a_1 \in \mathbb{F}_{2^m}\}$$

of hash functions. They fix pairwise different elements $x_1, \ldots, x_w \in \mathbb{F}_{2^m}$ in
advance. $H$ and the $x_i$ may be known to Eve. Now, to select $w$ positions
randomly (uniformly distributed and pairwise independent), Alice or Bob
randomly chooses some $h$ from $H$ (with respect to the uniform distribution)
and applies $h$ to the $x_i$. This yields $w$ uniformly distributed and pairwise
independent positions

$$h(x_1), h(x_2), \ldots, h(x_w).$$

Thus, the random choice of the $w$ positions reduces to the random choice of
an element $h$ in $H$, and this requires the random choice of $2m = 2\log_2(N)$
bits.
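The position selection described in this remark can be sketched in Python as follows. The sketch assumes a toy randomizer of $N = 2^8$ bits and represents $\mathbb{F}_{2^8}$ by the polynomial $x^8 + x^4 + x^3 + x + 1$; the names `positions` and `random_hash` and the choice $w = 16$ are hypothetical.

```python
import secrets

M = 8          # assumed toy size: N = 2^8 = 256 randomizer positions
POLY = 0x11B   # x^8 + x^4 + x^3 + x + 1, irreducible over F_2 (assumed representation)

def gf_mul(a, b):
    """Multiply a and b in F_{2^m} by shift-and-add with modular reduction."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if (a >> M) & 1:
            a ^= POLY
    return result

def positions(a0, a1, xs):
    """Apply h(x) = a0*x + a1 in F_{2^m} to each of the fixed elements x_i."""
    return [gf_mul(a0, x) ^ a1 for x in xs]

def random_hash():
    """Choose h uniformly from H; the pair (a0, a1) takes 2m random bits."""
    return secrets.randbits(M), secrets.randbits(M)

xs = list(range(1, 17))          # w = 16 fixed, pairwise different, public elements
a0, a1 = random_hash()
chosen = positions(a0, a1, xs)   # positions whose randomizer bits would be stored
```

Only the pair `(a0, a1)` has to be stored or transmitted, not the $w$ positions themselves. Pairwise independence does not mean that the positions are distinct: for `a0 = 0`, all of them coincide.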


Example. Assume that Alice and Bob do not share an initial private key
and the key is derived solely by a public discussion. Using the explicit
formulae, you get the following example for the Cachin-Maurer scheme (see
[CachMau97] for more details). A satellite broadcasting random bits at a
rate of 40 Gbit/s is used for $2 \cdot 10^5$ seconds (about 2 days) to provide a
randomizer $R$ with about $N = 8.6 \cdot 10^{15}$ bits. The storage capacity of the
adversary Eve is assumed to be 1/2 PB, which equals about $4.5 \cdot 10^{15}$ bits.
To get $l = 1.3 \cdot 10^7$ common positions and common random bits, Alice and
Bob each have to select and store $w = \sqrt{lN} = 3.3 \cdot 10^{11}$ bits (or about 39
GB) from $R$. By privacy amplification, they get a key $k$ of about 61 KB and
Eve knows not more than $10^{-20}$ bits of $k$, provided that the security event
$\mathcal{E}$ occurs. The probability of $\mathcal{E}$ is $\ge 1 - \varepsilon$, with $\varepsilon = 10^{-3}$. Since $l$ is of the
order of $1/\varepsilon^2$, the probability that the security event $\mathcal{E}$ does not occur
cannot be reduced to significantly smaller values without increasing the storage
requirements for Alice to unreasonably high values. To choose the $w$ positions
of the randomizer bits which they store, Alice and Bob each randomly select a
strongly universal hash function (see the preceding remark). So, to exchange
these positions, Alice and Bob have to transmit a strongly universal hash
function in each direction, which requires $2\log_2(N) \approx 106$ bits. For privacy
amplification, either Alice or Bob chooses the random universal hash function
and communicates it, which takes about $l$ bits $\approx 1.5$ MB. The large size of
the hash functions may be substantially reduced by using "almost universal"
hash functions.
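The figures of this example can be re-derived with a few lines of Python. The sketch assumes binary prefixes (1 Gbit $= 2^{30}$ bits, 1 PB $= 2^{50}$ bytes) and reads the exponents as $N \approx 8.6 \cdot 10^{15}$, $l = 1.3 \cdot 10^7$ and Eve's storage as 1/2 PB; the variable names are illustrative.

```python
import math

GBIT = 2**30                       # assumption: binary prefixes match the quoted figures
rate = 40 * GBIT                   # broadcast rate in bits per second
seconds = 2 * 10**5                # duration of the broadcast
N = rate * seconds                 # size of the randomizer R in bits
q = (2**50 * 8) // 2               # Eve's storage of 1/2 PB, in bits
l = 1.3e7                          # number of common positions (taken from the text)
w = math.sqrt(l * N)               # bits that Alice and Bob each select and store
w_gb = w / 8 / 2**30               # the same quantity in GB
hash_pos_bits = 2 * math.log2(N)   # bits describing one strongly universal h
hash_pa_mb = l / 8 / 2**20         # universal hash for privacy amplification, in MB

print(f"N = {N:.2e} bits, q = {q:.2e} bits")
print(f"w = {w:.2e} bits = {w_gb:.1f} GB")
print(f"2*log2(N) = {hash_pos_bits:.1f} bits, l = {hash_pa_mb:.2f} MB")
```

Under these assumptions the computed values agree with the rounded numbers in the example to within about one percent.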
Remarks:


1. As the example demonstrates, Alice and Bob need a very large capacity
for storing the positions and the values of the randomizer bits, and the
size of this storage rapidly increases if the probability that the security
event $\mathcal{E}$ does not occur is required to be significantly smaller than
$10^{-3}$, a value which is certainly not negligible. To restrict adversary
Eve's probability of success to negligible values would require unreasonably
high storage capacities of Alice and Bob. This is also true for the
private-key scenario.

2. If the key $k$ is derived solely by a public discussion, then both Alice and
Bob need storage on the order of $\sqrt{N}$, which is also on the order of $\sqrt{q}$
(recall that $N$ is the size of the randomizer and $q$ is the storage size of
the attacker). It is shown in [DziMau04b] that these storage requirements
cannot be reduced. The Cachin-Maurer scheme is essentially optimal in
terms of the ratio between the storage capacity of Alice and Bob and
the storage capacity of adversary Eve. The practicality of schemes in the
bounded storage model which do not rely on a shared initial secret key
is therefore highly questionable.


9.6.2 The Noisy Channel Model 


An introduction to the noisy channel model is, for example, given in the 
survey article [Wolf98]. As before, we use the “satellite scenario”. Random 
bits are broadcast by some radio source. Alice and Bob receive these bits 
and generate a key from them by a public discussion. The eavesdropper, Eve, 
also receives the random bits and can listen to the communication channel 
between Alice and Bob. Again we assume that Eve is a passive adversary. 
There are other models including an active adversary (see, e.g., [Maurer97]; 
[MauWol97]). Though all communications are public, Eve gains hardly any 
information about the key. Thus, the generated key appears almost random 
to Eve and can be used in a provably secure one-time pad. The secrecy of 
the key is based on the fact that no information channel is error-free. The 
system also works in the case where Eve receives the random bits via a much 
better channel than Alice and Bob. 

The key agreement works in three phases. As in Section 9.6.1, it starts 
with advantage distillation and ends with privacy amplification. There is an 
additional intermediate phase called information reconciliation. 

During advantage distillation, Alice chooses a truly random key k, for ex- 
ample from the radio source. Before transmitting it to Bob, she uses a random 
bit string r to mask k and to make the transmission to Bob highly reliable 
(applying a suitable error detection code randomized by r). The random bit 
string r is taken from the radio source and commonly available to all par- 
ticipants. If sufficient redundancy and randomness is built in, which means 
that the random bit string r is sufficiently long, the error probability of the
adversary is higher than the error probability of the legitimate recipient Bob.
In this way, Alice and Bob gain an advantage over Eve. 

When phase 1 is finished, the random string k held by Alice may still differ 
from the string k’ received by Bob. Now, Alice and Bob start information 
reconciliation and interactively modify k and k’, such that at the end, the 
probability that $k \neq k'$ is negligibly small. This must be performed without
leaking too much information to the adversary Eve. Alice and Bob may, for 
example, try to detect differing positions in the string by comparing the parity 
bits of randomly chosen substrings. 
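The parity comparison mentioned above can be illustrated by a small Python sketch. It is not the protocol of the cited literature, only a minimal bisection scheme under the assumption that Alice announces parities of blocks over the public channel and Bob corrects his string; all names and parameters are hypothetical.

```python
import random

def parity(bits, lo, hi):
    """Parity of bits[lo:hi]; this single bit is what is announced publicly."""
    return sum(bits[lo:hi]) % 2

def locate_error(alice, bob, lo, hi):
    """Bisect a block whose parities differ down to one differing position."""
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if parity(alice, lo, mid) != parity(bob, lo, mid):
            hi = mid   # an odd number of differences lies in the left half
        else:
            lo = mid   # otherwise it lies in the right half
    return lo

def reconcile(alice, bob, rounds, block, rng):
    """Compare parities of randomly placed blocks; Bob flips each located bit."""
    bob = list(bob)
    for _ in range(rounds):
        lo = rng.randrange(0, len(alice) - block + 1)
        hi = lo + block
        if parity(alice, lo, hi) != parity(bob, lo, hi):
            bob[locate_error(alice, bob, lo, hi)] ^= 1
    return bob

rng = random.Random(2024)
k_alice = [rng.randrange(2) for _ in range(64)]
k_bob = list(k_alice)
k_bob[37] ^= 1                     # a single transmission error
k_fixed = reconcile(k_alice, k_bob, rounds=1, block=64, rng=rng)
```

Every announced parity reveals up to one bit of information to Eve, which is why the subsequent privacy amplification phase is needed.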

After phase 2, the same random string $k$ of size $l$ is available to Alice and
Bob with a very high probability, and they have an advantage over Eve. Eve's
information about $k$, measured by Rényi entropies, is incomplete. Applying
the privacy amplification techniques sketched in Section 9.6.1, Alice and Bob
obtain their desired key.


Exercises 


1. Let $n \in \mathbb{N}$. We consider the affine cipher modulo $n$. It is a symmetric
encryption scheme. A key $(a,b)$ consists of a unit $a \in \mathbb{Z}_n^*$ and an element
$b \in \mathbb{Z}_n$. A message $m \in \mathbb{Z}_n$ is encrypted as $a \cdot m + b$.

Is the affine cipher perfectly secret if we randomly (and uniformly) choose
a key for each message $m$ to be encrypted?


2. Let $E$ be an encryption algorithm, which encrypts plaintexts $m \in M$
as ciphertexts $c \in C$, and let $K$ denote the secret key used to decrypt
ciphertexts.

Show that an adversary's uncertainty about the secret key is at least as
great as her uncertainty about the plaintext: $H(K|C) \ge H(M|C)$.


3. ElGamal’s encryption (Section 3.5.1) is probabilistic. Is it computation- 
ally secret? 


4. Consider Definition 9.14 of computationally secret encryptions. Show
that an encryption algorithm $E(i,m)$ is computationally secret if and
only if for every probabilistic polynomial distinguishing algorithm
$A(i, m_0, m_1, c)$ and every probabilistic polynomial sampling algorithm
$S$, which on input $i \in I$ yields $S(i) = \{m_0, m_1\} \subset M_i$, and every positive
polynomial $P \in \mathbb{Z}[X]$, there is a $k_0 \in \mathbb{N}$ such that for all $k \ge k_0$,

$$\bigl|\, \mathrm{prob}(A(i, m_0, m_1, c) = m_0 : i \leftarrow K(1^k), \{m_0, m_1\} \leftarrow S(i), c \leftarrow E(i, m_0))$$
$$\quad - \mathrm{prob}(A(i, m_0, m_1, c) = m_0 : i \leftarrow K(1^k), \{m_0, m_1\} \leftarrow S(i), c \leftarrow E(i, m_1)) \,\bigr| \le \frac{1}{P(k)}.$$


5. Let $I_k := \{(n,e) \mid n = pq,\ p, q \text{ distinct primes},\ |p| = |q| = k,\ e \in \mathbb{Z}_{\varphi(n)}^*\}$
and $(n,e) \xleftarrow{u} I_k$ be a randomly chosen RSA key. Encrypt messages
$m \in \{0,1\}^r$, where $r \le \log_2(|n|)$, in the following way. Pad $m$ with leading
random bits to get a padded plaintext $\bar{m}$ with $|\bar{m}| = |n|$. If $\bar{m} \ge n$, then
repeat the padding of $m$ until $\bar{m} < n$. Then encrypt $m$ as
$E(m) := c := \bar{m}^e \bmod n$.

Prove that this scheme is computationally secret. 

Hint: use Exercise 7 in Chapter 8. 


6. Let $I = (I_k)_{k \in \mathbb{N}}$ be a key set with security parameter, and let
$f = (f_i : D_i \to D_i)_{i \in I}$ be a family of one-way trapdoor permutations with
hard-core predicate $B = (B_i : D_i \to \{0,1\})_{i \in I}$ and key generator $K$.
Consider the following probabilistic public-key encryption scheme
([GolMic84]; [GolBel01]): Let $Q$ be a polynomial and $n := Q(k)$. A
bit string $m := m_1 \ldots m_n$ is encrypted as a concatenation $c_1 || \ldots || c_n$,
where $c_j := f_i(x_j)$ and $x_j$ is a randomly selected element of $D_i$ with
$B_i(x_j) = m_j$.

Describe the decryption procedure and show that the encryption scheme
is computationally secret.

Hints: Apply Exercise 4. Given a pair $\{m_0, m_1\}$ of plaintexts, construct
a sequence of messages $\tilde{m}_1 := m_0, \tilde{m}_2, \ldots, \tilde{m}_n := m_1$, such that $\tilde{m}_{j+1}$
differs from $\tilde{m}_j$ in at most one bit. Then consider the sequence of
distributions $c \leftarrow E(i, \tilde{m}_j)$ (also see the proof of Proposition 8.4).


7. We consider the Goldwasser-Micali probabilistic encryption scheme
([GolMic84]). Let $I_k := \{n \mid n = pq,\ p, q \text{ distinct primes},\ |p| = |q| = k\}$
and $I := (I_k)_{k \in \mathbb{N}}$. As his public key, each user Bob randomly chooses an
$n \xleftarrow{u} I_k$ (by first randomly choosing the secret primes $p$ and $q$) and a
quadratic non-residue $z \xleftarrow{u} \mathrm{QNR}_n$ with Jacobi symbol $\left(\frac{z}{n}\right) = 1$ (he can
do this easily, since he knows the primes $p$ and $q$; see Appendix A.6).
A bit string $m = m_1 \ldots m_n$ is encrypted as a concatenation $c_1 || \ldots || c_n$,
where $c_j = x_j^2$ if $m_j = 1$, and $c_j = z x_j^2$ if $m_j = 0$, with a randomly chosen
$x_j \xleftarrow{u} \mathbb{Z}_n^*$. In other words: A 1 bit is encrypted by a random quadratic
residue, a 0 bit by a random non-residue.

Describe the decryption procedure and show that the encryption scheme
is computationally secret, provided the quadratic residuosity assumption
(Definition 6.11) is true.

Hint: The proof is similar to the proof of Exercise 6. Use Exercise 9 in
Chapter 6.


8. Let $l \ge f$. Prove that:
a. $H := \{h_a : \mathbb{F}_{2^l} \to \mathbb{F}_{2^f},\ x \mapsto \mathrm{msb}_f(a \cdot x) \mid a \in \mathbb{F}_{2^l}\}$ is a universal
class of hash functions.
b. $H := \{h_{a_0,a_1} : \mathbb{F}_{2^l} \to \mathbb{F}_{2^f},\ x \mapsto \mathrm{msb}_f(a_0 \cdot x + a_1) \mid a_0, a_1 \in \mathbb{F}_{2^l}\}$ is
a strongly universal class of hash functions.
Here we consider $\{0,1\}^m$ as equipped with the structure $\mathbb{F}_{2^m}$ of the Galois
field with $2^m$ elements. As before, $\mathrm{msb}_f$ denotes the $f$ most-significant
bits.


10. Provably Secure Digital Signatures 


In previous sections, we discussed signature schemes (Full-Domain-Hash RSA 
signatures and PSS in Section 3.4.5; the Fiat-Shamir signature scheme in 
Section 4.2.5) that include a hash function h and whose security can be 
proven in the random oracle model. It is assumed that the hash function 
h is a random oracle, i.e., it behaves like a perfectly random function (see 
Sections 3.4.4 and 3.4.5). Perfectly random means that for all messages m, 
each of the k bits of the hash value h(m) is determined by tossing a coin, 
or, equivalently, that the map $h : X \to Y$ is randomly chosen from the set
$F(X,Y)$ of all functions from $X$ to $Y$. In general, $F(X,Y)$ is tremendously
large. For example, if $X = \{0,1\}^n$ and $Y = \{0,1\}^k$, then $|F(X,Y)| = 2^{k \cdot 2^n}$.
Thus, it is obvious that perfectly random oracles cannot be implemented.
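To get a feeling for these sizes, the following Python sketch computes how many bits are needed to write down one element of $F(X,Y)$, namely one $k$-bit value for each of the $2^n$ inputs. The function name and the sample parameters are illustrative assumptions.

```python
from itertools import product

def bits_to_describe(n, k):
    """log2 |F(X, Y)| for X = {0,1}^n and Y = {0,1}^k: k bits per input value."""
    return k * 2**n

def count_functions(n, k):
    """Count all functions {0,1}^n -> {0,1}^k by brute force (tiny n, k only)."""
    return sum(1 for _ in product(range(2**k), repeat=2**n))

tiny = bits_to_describe(2, 1)        # exponent for 2-bit inputs and 1-bit outputs
small = bits_to_describe(8, 8)       # a byte-to-byte function
large = bits_to_describe(256, 256)   # sizes typical for a hash function
```

Already a table for a byte-to-byte function takes 2048 bits, while for 256-bit inputs and outputs the table would need $2^{264}$ bits.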

Moreover, examples of cryptographic schemes were constructed that are 
provably secure in the random oracle model, but are insecure in any real-world 
implementation, where the random oracle is replaced by a real hash function. 
Although these examples are contrived, doubts on the random oracle model 
arose (see the remark on page 244 in Section 9.5). 

Therefore, it is desirable to have signature schemes whose security can 
be proven solely under standard assumptions (like the RSA or the discrete 
logarithm assumption). Examples of such signature schemes are given in this 
chapter. 


10.1 Attacks and Levels of Security 


Digital signature schemes are public-key cryptosystems and are based on the 
one-way feature of a number-theoretical function. A digital signature scheme, 
for example the basic RSA signature scheme (Section 3.3.2) or ElGamal’s 
signature scheme (Section 3.5.2), consists of the following: 


1. A key generation algorithm $K$, which on input $1^k$ ($k$ being the security
parameter) produces a pair $(pk, sk)$ consisting of a public key $pk$ and a
secret (private) key $sk$.

2. A signing algorithm $S(sk, m)$, which given the secret key $sk$ of user Alice
and a message $m$ to be signed, generates Alice's signature $\sigma$ for $m$.

3. A verification algorithm $V(pk, m, \sigma)$, which given Alice's public key $pk$,
a message $m$ and a signature $\sigma$, checks whether $\sigma$ is a valid signature of
Alice for $m$. Valid means that $\sigma$ might be output by $S(sk, m)$, where $sk$
is Alice's secret key.
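As an illustration of the triple $(K, S, V)$, here is a Python sketch of the basic RSA signature scheme without hashing. The fixed toy primes, the function names and the omission of the security parameter are assumptions made for brevity; the sketch is insecure and only shows the interface.

```python
import math
import secrets

def keygen():
    """K: return (pk, sk) for basic RSA with fixed toy primes (insecure sizes)."""
    p, q = 1019, 1031                       # assumed toy primes, not random
    n, phi = p * q, (p - 1) * (q - 1)
    while True:                             # probabilistic choice of e
        e = secrets.randbelow(phi - 3) + 3
        if math.gcd(e, phi) == 1:
            break
    d = pow(e, -1, phi)                     # modular inverse (Python 3.8+)
    return (n, e), (n, d)

def sign(sk, m):
    """S: the signature of m in Z_n is m^d mod n."""
    n, d = sk
    return pow(m, d, n)

def verify(pk, m, sigma):
    """V: accept iff sigma^e mod n equals m."""
    n, e = pk
    return pow(sigma, e, n) == m % n

pk, sk = keygen()
sigma = sign(sk, 123456)
accepted = verify(pk, 123456, sigma)
```

Here key generation is probabilistic, whereas signing and verification are deterministic.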


Of course, all algorithms must be polynomial. The key generation algorithm 
K is always a probabilistic algorithm. In many cases, the signing algorithm 
is also probabilistic (see, e.g., ElGamal or PSS). The verification algorithm 
might be probabilistic, but in practice it usually is deterministic. 

As with encryption schemes, there are different types of attacks on signa- 
ture schemes. We may distinguish between (see [GolMicRiv88]): 


1. Key-only attack. The adversary Eve only knows the public key of the 
signer Alice. 

2. Known-signature attack. Eve knows the public key of Alice and has seen 
message-signature pairs produced by Alice. 

3. Chosen-message attack. Eve may choose a list $(m_1, \ldots, m_t)$ of messages
and ask Alice to sign these messages.

4. Adaptively-chosen-message attack. Eve can adaptively choose messages 
to be signed by Alice. She can choose some messages and gets the cor- 
responding signatures. Then she can do cryptanalysis and, depending 
on the outcome of her analysis, she can choose the next message to be 
signed, and so on. 


Adversary Eve’s level of success may be described in increasing order as (see 
[GolMicRiv88]):


1. Existential forgery. Eve is able to forge the signature of at least one 
message, not necessarily the one of her choice. 

2. Selective forgery. Eve succeeds in forging the signature of some messages 
of her choice. 

3. Universal forgery. Although unable to find Alice’s secret key, Eve is able 
to forge the signature of any message. 

4. Retrieval of secret keys. Eve finds out Alice’s secret key. 


As we have seen before, signatures in the basic RSA, ElGamal and DSA
schemes, without first applying a suitable hash function, can be easily
existentially forged using a key-only attack (see Chapter 3). In the basic Rabin
scheme, secret keys may be retrieved by a chosen-message attack (see Section
3.6.1). We may define the level of security of a signature scheme by the level
of success of an adversary performing a certain type of attack. Different levels
of security may be required in different applications.
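The key-only existential forgery against basic RSA signatures can be sketched in a few lines of Python: the forger picks a signature first and derives the message that it signs. The toy public key below and the names `forge` and `verify` are assumptions for illustration.

```python
def verify(pk, m, sigma):
    """Basic RSA verification: accept iff sigma^e mod n equals m."""
    n, e = pk
    return pow(sigma, e, n) == m % n

def forge(pk, sigma):
    """Key-only existential forgery: choose sigma, then compute m = sigma^e mod n."""
    n, e = pk
    m = pow(sigma, e, n)      # uses only public information
    return m, sigma

pk = (1019 * 1031, 65537)     # assumed toy public key; d is never needed
m, sigma = forge(pk, 424242)
forged_ok = verify(pk, m, sigma)
```

The forger has no control over the resulting message $m$, which is why this is an existential and not a selective forgery.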

In this chapter, we are interested in signature schemes which provide the 
maximum level of security. The adversary Eve cannot succeed in an existen- 
tial forgery with a significant probability, even if she is able to perform an 
adaptively-chosen-message attack. As usual, the adversary is modeled as a 
probabilistic polynomial algorithm.

Definition 10.1. Let $D$ be a digital signature scheme, with key generation
algorithm $K$, signing algorithm $S$ and verification algorithm $V$. An
existential forger $F$ for $D$ is a probabilistic polynomial algorithm $F$ that on
input of a public key $pk$ outputs a message-signature pair $(m, \sigma) := F(pk)$.
$F$ is successful on $pk$ if $\sigma$ is a valid signature of $m$ with respect to $pk$, i.e.,
$V(pk, F(pk)) = \text{accept}$. $F$ performs an adaptively-chosen-message attack if,
while computing $F(pk)$, $F$ can repeatedly generate a message $m$ and then is
supplied with a valid signature $\sigma$ for $m$.


Remarks. Let F be an existential forger performing an adaptively-chosen- 
message attack: 


1. Let $(pk, sk)$ be a key of security parameter $k$. Since the running time
of $F(pk)$ is bounded by a polynomial in $k$ (note that $pk$ is generated in
polynomial time from $1^k$), the number of messages for which $F$ requests
a signature is bounded by $T(k)$, where $T$ is a polynomial.

2. The definition leaves open who supplies $F$ with the valid signatures. If
$F$ is used in an attack against the legitimate signer, then the signatures
$\sigma$ are supplied by the signing algorithm $S$, $S(sk, m) = \sigma$, where $sk$ is the
private key associated with $pk$. In a typical security proof, the signatures
are supplied by a “simulated signer”, who is able to generate valid sig- 
natures without knowing the trapdoor information that is necessary to 
derive sk from pk. This sounds mysterious and impossible. Actually, for 
some time it was believed that a security proof for a signature scheme is 
not possible, because it would necessarily yield an algorithm for invert- 
ing the underlying one-way function. However, the security proof given 
by Goldwasser, Micali and Rivest for their GMR scheme (discussed in 
Section 10.3) proved the contrary. See [GolMicRiv88], Section 4: The 
paradox of proving signature schemes secure. The key idea for solving 
the paradox is that the simulated signer constructs signatures for keys 
whose form is a very specific one, whereas their probability distribution 
is the same as the distribution of the original keys (see, e.g., the proof of 
Theorem 10.12). 

3. The signatures $\sigma_i$, $1 \le i \le T(k)$, supplied to $F$ are, besides $pk$, inputs to
$F$. The messages $m_i$, $1 \le i \le T(k)$, for which $F$ requests signatures, are
outputs of $F$. Let $M_i$ be the random variable describing the $i$-th message
$m_i$. Since $F$ adaptively chooses the messages, message $m_i$ may depend on
the messages $m_j$ and the signatures $\sigma_j$ supplied to $F$ for $m_j$, $1 \le j < i$.
Thus $M_i$ may be considered as a probabilistic algorithm with inputs $pk$
and $(m_j, \sigma_j)_{1 \le j < i}$.
The probability of success of $F$ for security parameter $k$ is then computed
as¹

$$\mathrm{prob}(V(pk, F(pk, (\sigma_i)_{1 \le i \le T(k)})) = \text{accept} : (pk, sk) \leftarrow K(1^k),$$
$$\quad m_i \leftarrow M_i(pk, (m_j, \sigma_j)_{1 \le j < i}),\ \sigma_i \leftarrow S(sk, m_i),\ 1 \le i \le T(k)).$$

1 Unless otherwise stated, we always mean F's probability of success when the
signatures for the adaptively chosen messages are supplied by the legitimate
signer S.


Definition 10.2. A digital signature scheme is secure against adaptively-
chosen-message attacks if and only if for every existential forger $F$ performing
an adaptively-chosen-message attack and every positive polynomial $P$, there
is a $k_0 \in \mathbb{N}$ such that for all security parameters $k \ge k_0$, the probability of
success of $F$ is $\le 1/P(k)$.


Remark. Fail-stop signature schemes provide an additional security feature. 
If a forger — even if he has unlimited computing power and can do an ex- 
ponential amount of work — succeeds in generating a valid signature, then 
the legitimate signer Alice can prove with a high probability that the signa- 
ture is forged. In particular, Alice can detect forgeries and then stop using 
the signing mechanism (“fail then stop”). The signature scheme is based on 
the assumed hardness of a computational problem, and the proof of forgery 
is performed by showing that this underlying assumption has been compro- 
mised. Fail-stop signature schemes were introduced by Waidner and Pfitz- 
mann ([WaiPfi89]). We do not discuss fail-stop signatures here (see, e.g., 
[Pfitzmann96]; [MenOorVan96]; [Stinson95]; [BarPfi97]). 


10.2 Claw-Free Pairs and Collision-Resistant Hash 
Functions 


In many digital signature schemes, the message to be signed is first hashed 
with a collision-resistant hash function. Provably collision-resistant hash func- 
tions can be constructed from claw-free pairs of trapdoor permutations. In 
Section 10.3 we will discuss the GMR signature scheme introduced by Gold- 
wasser, Micali and Rivest. It was the first signature scheme that was provably 
secure against adaptively-chosen-message attacks (without depending on the 
random oracle model), and it is based on claw-free pairs. 


Definition 10.3. Let $f_0 : D \to D$ and $f_1 : D \to D$ be permutations of the
same domain $D$. A pair $(x,y)$ is called a claw of $f_0$ and $f_1$ if $f_0(x) = f_1(y)$.


Let $I = (I_k)_{k \in \mathbb{N}}$ be a key set with security parameter $k$. We consider
families

$$f_0 = (f_{0,i} : D_i \to D_i)_{i \in I}, \quad f_1 = (f_{1,i} : D_i \to D_i)_{i \in I}$$

of one-way permutations with common key generator $K$ that are defined on
the same domains.


Definition 10.4. $(f_0, f_1)$ is called a claw-free pair of one-way permutations
if it is infeasible to compute claws; i.e., for every probabilistic polynomial
algorithm $A$ which on input $i$ outputs distinct elements $x, y \in D_i$, and for
every positive polynomial $P$, there is a $k_0 \in \mathbb{N}$ such that for all $k \ge k_0$

$$\mathrm{prob}(f_{0,i}(x) = f_{1,i}(y) : i \leftarrow K(1^k), \{x, y\} \leftarrow A(i)) \le \frac{1}{P(k)}.$$

Claw-free pairs of one-way permutations exist if, for example, factoring
is hard.


Proposition 10.5. Let $I := \{n \mid n = pq,\ p, q \text{ primes},\ p \equiv 3 \bmod 8,\ q \equiv 7 \bmod 8\}$.
If the factoring assumption (Definition 6.9) holds, then

$$CQ := (f_n, g_n : \mathrm{QR}_n \to \mathrm{QR}_n)_{n \in I},$$

where $f_n(x) := x^2$ and $g_n(x) := 4x^2$, is a claw-free pair of one-way
permutations (with trapdoor).


Proof. Let $n = pq \in I$. Since both primes, $p$ and $q$, are congruent 3 modulo
4, $f_n$ is a permutation of $\mathrm{QR}_n$ (Proposition A.66). Four is a square and it is
a unit in $\mathbb{Z}_n$. Thus, $g_n$ is also a permutation of $\mathrm{QR}_n$. Now, let $x, y \in \mathrm{QR}_n$
with $x^2 = 4y^2 \bmod n$ be a claw. From $n \equiv 1 \bmod 4$ and $n \equiv -3 \bmod 8$,
we conclude $\left(\frac{-1}{n}\right) = 1$ and $\left(\frac{2}{n}\right) = -1$ (Theorem A.57). We get
$\left(\frac{\pm 2y}{n}\right) = \left(\frac{\pm 1}{n}\right) \cdot \left(\frac{2}{n}\right) \cdot \left(\frac{y}{n}\right) = 1 \cdot (-1) \cdot 1 = -1$, whereas $\left(\frac{x}{n}\right) = 1$ since $x$ is a
square. Thus $x \neq \pm 2y$, and the Euclidean algorithm yields a factorization
of $n$. Namely, $x^2 - 4y^2 = (x - 2y)(x + 2y) \equiv 0 \bmod n$, whereas neither factor
is divisible by $n$, and thus $\gcd(x - 2y, n)$ is a non-trivial divisor of $n$. We see
that an algorithm which, with some probability, finds claws of $CQ$ yields an
algorithm factoring $n$ with the same probability, which is a contradiction to
the factoring assumption. □
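The argument of the proof can be replayed on a toy modulus. The Python sketch below assumes $p = 11$ and $q = 7$ (so that $p \equiv 3 \bmod 8$ and $q \equiv 7 \bmod 8$), finds all claws of $CQ$ by brute force, which is feasible only because $n$ is tiny, and derives a factor from each of them. All names are illustrative.

```python
import math

P, Q = 11, 7   # assumed toy primes: 11 = 3 mod 8 and 7 = 7 mod 8
N = P * Q

def quadratic_residues(n):
    """All squares among the units modulo n, i.e. the set QR_n."""
    return sorted({a * a % n for a in range(1, n) if math.gcd(a, n) == 1})

def claws(n):
    """Brute-force all (x, y) in QR_n x QR_n with x^2 = 4*y^2 mod n."""
    qr = quadratic_residues(n)
    return [(x, y) for x in qr for y in qr if (x * x - 4 * y * y) % n == 0]

def factor_from_claw(x, y, n):
    """gcd(x - 2y, n) is a non-trivial divisor, since x differs from +-2y mod n."""
    return math.gcd((x - 2 * y) % n, n)

QR = quadratic_residues(N)
found = claws(N)
divisors = {factor_from_claw(x, y, N) for x, y in found}
```

For example, $(4, 9)$ is a claw modulo 77, because $4^2 = 16$ and $4 \cdot 9^2 = 324 \equiv 16$, and $\gcd(4 - 18, 77) = 7$.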


Claw-free pairs of one-way permutations can be used to construct collision- 
resistant hash functions. 


Definition 10.6. Let $I = (I_k)_{k \in \mathbb{N}}$ be a key set with security parameter $k$,
and let $K$ be a probabilistic polynomial sampling algorithm for $I$, which on
input $1^k$ outputs a key $i \in I_k$. Let $k(i)$ be the security parameter of $i$ (i.e.,
$k(i) = k$ for $i \in I_k$), and $g : \mathbb{N} \to \mathbb{N}$ be a polynomial function. A family

$$H = (h_i : \{0,1\}^* \to \{0,1\}^{g(k(i))})_{i \in I}$$

of hash functions is called a family of collision-resistant (or collision-free)
hash functions (or a collision-resistant hash function for short) with key
generator $K$, if:

1. The hash values $h_i(x)$ can be computed by a polynomial algorithm $H$
with inputs $i \in I$ and $x \in \{0,1\}^*$.

2. It is computationally infeasible to find a collision; i.e., for every
probabilistic polynomial algorithm $A$ which on input $i \in I$ outputs messages
$m_0, m_1 \in \{0,1\}^*$, $m_0 \neq m_1$, and for every positive polynomial $P$, there
is a $k_0 \in \mathbb{N}$ such that

$$\mathrm{prob}(h_i(m_0) = h_i(m_1) : i \leftarrow K(1^k), \{m_0, m_1\} \leftarrow A(i)) \le \frac{1}{P(k)}$$

for all $k \ge k_0$.

Remark. Collision-resistant hash functions are one way (see Exercise 2).


Let $(f_0, f_1)$ be a pair of one-way permutations, as above. For every
$m \in \{0,1\}^*$, we may derive a family $f_m = (f_{m,i} : D_i \to D_i)_{i \in I}$ as follows. For
$m := m_1 \ldots m_l \in \{0,1\}^l$ and $x \in D_i$, let

$$f_{m,i}(x) := f_{m_1,i}(f_{m_2,i}(\ldots f_{m_l,i}(x) \ldots)).$$

If $m \in \{0,1\}^*$ is the concatenation $m := m_1 || m_2$ of strings $m_1$ and $m_2$, then
obviously $f_{m,i}(x) = f_{m_1,i}(f_{m_2,i}(x))$.
This family may now be used to construct a family $H = (h_j)_{j \in J}$ of hash
functions. Let $J_k := \{(i,x) \mid i \in I_k, x \in D_i\}$ and $J := \bigcup_{k \in \mathbb{N}} J_k$. We define

$$F_j(m) := f_{m,i}(x) \in D_i \text{ for } j = (i,x) \in J \text{ and } m \in \{0,1\}^*.$$


Our goal is a collision-resistant family of hash functions $H$. To achieve this,
we have to modify our construction a little and first encode our messages $m$
in a prefix-free way. Let $[m]$ denote a prefix-free (binary) encoding of the
messages $m$ in $\{0,1\}^*$. Prefix-free means that no encoded $[m]$ appears as a
prefix² in the encoding $[m']$ of an $m' \neq m$. For example, we might encode
each 1 by 1, each 0 by 00 and terminate all encoded messages by 01.³ We
define

$$h_j(m) := F_j([m]), \text{ for } m \in \{0,1\}^* \text{ and } j \in J.$$

We will prove that H is a collision-resistant family of hash functions if the 
pair (fo, f1) is claw-free. Thus, we obtain the following proposition. 


Proposition 10.7. If claw-free pairs of one-way permutations exist, then 
collision-resistant hash functions also exist. 


Proof. Let $H$ be the family of hash functions constructed above. Assume that
$H$ is not collision-resistant. This means that there is a positive polynomial $P$
and a probabilistic polynomial algorithm $A$ which on input $j = (i,x) \in J_k$
finds a collision $\{m, m'\}$ of $h_j$ with non-negligible probability $\ge 1/P(k)$
for infinitely many $k$. Collision means
that $f_{[m],i}(x) = f_{[m'],i}(x)$. Let $[m] = m_1 \ldots m_r$ and $[m'] = m'_1 \ldots m'_{r'}$
($m_j, m'_j \in \{0,1\}$), and let $l$ be the smallest index $u$ with $m_u \neq m'_u$.
Such an index $l$ exists, since $[m]$ is not a prefix of $[m']$, nor vice versa.

We have $f_{m_l \ldots m_r,i}(x) = f_{m'_l \ldots m'_{r'},i}(x)$, since $f_{0,i}$ and $f_{1,i}$ are injective. Then
$(f_{m_{l+1} \ldots m_r,i}(x), f_{m'_{l+1} \ldots m'_{r'},i}(x))$ is a claw of $(f_0, f_1)$. The binary lengths $r$
and $r'$ of $[m]$ and $[m']$ are bounded by a polynomial in $k$, since $m$ and $m'$
are computed by the polynomial algorithm $A$. Thus, the claw of $(f_0, f_1)$ can
be computed from the collision $\{m, m'\}$ in polynomial time. Hence, we can
compute claws with non-negligible probability $\ge 1/P(k)$, for infinitely many
$k$, which is a contradiction.
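The hash function $h_j$ used in this proof can be sketched in Python. The sketch assumes the pair $CQ$ of Proposition 10.5 with the toy modulus $n = 1019 \cdot 1031$ and the example encoding from the text (1 as 1, 0 as 00, terminator 01); the function names and the sample value `x0` are hypothetical.

```python
P, Q = 1019, 1031   # assumed toy primes with P = 3 mod 8 and Q = 7 mod 8
N = P * Q

def encode(bits):
    """Prefix-free encoding from the text: 1 -> 1, 0 -> 00, terminator 01."""
    return "".join("1" if b == "1" else "00" for b in bits) + "01"

def f(bit, x):
    """f_0(x) = x^2 mod n and f_1(x) = 4*x^2 mod n (the pair CQ)."""
    return (x * x if bit == "0" else 4 * x * x) % N

def f_string(s, x):
    """f_s(x) = f_{s_1}(f_{s_2}(... f_{s_r}(x) ...)); the last bit acts first."""
    for bit in reversed(s):
        x = f(bit, x)
    return x

def h(x, m):
    """h_j(m) = F_j([m]) for the key j = (n, x) with x in QR_n."""
    return f_string(encode(m), x)

x0 = pow(123456, 2, N)   # some quadratic residue serving as the public value x
digest = h(x0, "1011")
```

Each bit of the encoded message costs one modular squaring, which illustrates why such provably collision-resistant constructions are slow compared to custom-designed hash functions.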


2 A string s is called a prefix of a string t if t = ss' is the concatenation of s and
another string s'.

3 Efficient prefix-free encodings exist, such that [m] has almost the same length as
m (see, e.g., [BerPer85]).

Remarks:


1. The constructed hash functions are rather inefficient. Thus, in practice, 
custom-designed hash functions such as SHA are used, whose collision 
resistance cannot be proven rigorously (see Section 3.4). 

2. Larger sets of pairwise claw-free one-way permutations may be used in
the construction instead of one pair, for example sets with $2^s$ elements.
Then $s$ bits of the messages $m$ are processed in one step. There are larger
sets of pairwise claw-free one-way permutations that are based on the
assumptions that factoring and the computation of discrete logarithms
are infeasible (see [Damgard87]).

3. Another method of constructing provably collision-resistant hash func- 
tions is given in Exercise 8 in Chapter 3. It is based on the assumed 
infeasibility of computing discrete logarithms. 


10.3 Authentication-Tree-Based Signatures 


Again, we consider a claw-free pair $(f_0, f_1)$ of one-way permutations, as above.
In addition, we assume that $f_0$ and $f_1$ have trapdoors. Such a claw-free pair of
trapdoor permutations and the induced functions $f_{m,i}$, as defined above, may
be used to generate probabilistic signatures. Namely, Alice randomly chooses
some $i \in I$ (with a sufficiently large security parameter $k$) and some $x \in D_i$,
and publishes $(i,x)$ as her public key. Then Alice, by using her trapdoor
information, computes her signature $\sigma(i,x,m)$ for a message $m \in \{0,1\}^*$ as

$$\sigma(i,x,m) := f_{[m],i}^{-1}(x),$$

where $[m]$ denotes some (fixed) prefix-free encoding of $m$. Bob can verify
Alice's signature $\sigma$ by comparing $f_{[m],i}(\sigma)$ with $x$. Since $f_{[m],i}$ is one way
and $m \mapsto f_{[m],i}(x)$ is collision resistant, as we have just seen in the proof of
Proposition 10.7, only Alice can compute the signature for $m$, and no one
can use one signature $\sigma$ for two different messages $m$ and $m'$. Unfortunately,
this scheme is a one-time signature scheme. This means that only one
message can be signed by Alice with her key $(i,x)$; otherwise the security is not
guaranteed.⁴ If two messages $m \neq m'$ were signed with the same reference
value $x$, then a claw of $f_0$ and $f_1$ can be easily computed from $\sigma(i,x,m)$ and
$\sigma(i,x,m')$ (see Exercise 5)⁵, and this can be a severe security risk. If we use,
for example, the claw-free pair of Proposition 10.5, then Alice's secret key
(the factors of the modulus $n$) can be easily retrieved from the computed
claw.
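The one-time signature scheme can be sketched in Python for the pair $CQ$ of Proposition 10.5. The sketch assumes the toy primes 1019 and 1031 as Alice's trapdoor and the example encoding 1 as 1, 0 as 00 with terminator 01; the function names and the reference value `x0` are hypothetical.

```python
P, Q = 1019, 1031        # assumed toy trapdoor: P = 3 mod 8 and Q = 7 mod 8
N = P * Q
INV4 = pow(4, -1, N)     # inverse of 4 modulo n (Python 3.8+)

def encode(bits):
    """Prefix-free encoding: 1 -> 1, 0 -> 00, terminator 01."""
    return "".join("1" if b == "1" else "00" for b in bits) + "01"

def f(bit, x):
    """Public direction: f_0(x) = x^2 mod n, f_1(x) = 4*x^2 mod n."""
    return (x * x if bit == "0" else 4 * x * x) % N

def sqrt_qr(y):
    """Trapdoor step: the square root of y in QR_n that again lies in QR_n."""
    rp = pow(y, (P + 1) // 4, P)
    rq = pow(y, (Q + 1) // 4, Q)
    return (rp * Q * pow(Q, -1, P) + rq * P * pow(P, -1, Q)) % N

def f_inv(bit, y):
    """Invert f_0 or f_1 on QR_n using the factors of n."""
    return sqrt_qr(y if bit == "0" else y * INV4 % N)

def sign(x, m):
    """sigma = f_[m]^{-1}(x): undo the outermost function f_{s_1} first."""
    for bit in encode(m):
        x = f_inv(bit, x)
    return x

def verify(x, m, sigma):
    """Accept iff f_[m](sigma) = x."""
    for bit in reversed(encode(m)):
        sigma = f(bit, sigma)
    return sigma == x

x0 = pow(123456, 2, N)   # public reference value x in QR_n
sigma = sign(x0, "1011")
valid = verify(x0, "1011", sigma)
```

Verification needs only the public functions, whereas signing extracts one square root per encoded bit with the help of the factors of $n$.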


4 More examples of one-time signature schemes may be found, e.g., in
[MenOorVan96] and [Stinson95].

5 It is not a contradiction to the assumed claw-freeness of the pair that the claw 
can be computed, because here the adversary is additionally supplied with two 
signatures which can only be computed by use of the secret trapdoor information.

In the GMR signature scheme ([GolMicRiv88]), Goldwasser, Micali and
Rivest overcome this difficulty by using a new random reference value for
each message $m$. Of course, it is not possible to publish all these reference
values as a public key in advance, so the reference values are attached to the
signatures. Then it is necessary to authenticate these reference values, and
this is accomplished by a second claw-free pair of trapdoor permutations.
The GMR scheme is based on two claw-free pairs

$$(f_0, f_1) = (f_{0,i}, f_{1,i} : D_i \to D_i)_{i \in I}, \quad (g_0, g_1) = (g_{0,j}, g_{1,j} : E_j \to E_j)_{j \in J}$$

of trapdoor permutations, defined over $I = (I_k)_{k \in \mathbb{N}}$ and $J = (J_k)_{k \in \mathbb{N}}$. Each
user Alice runs a key-generation algorithm $K(1^k)$ to randomly choose an
$i \in I_k$ and a $j \in J_k$ (and generate the associated trapdoor information).
Moreover, Alice generates a binary "authentication tree" of depth $d$. The
nodes of this tree are randomly chosen elements in $D_i$. Then Alice publishes
$(i, j, r)$ as her public key, where $r$ is the root of the authentication tree.
The authentication tree has $2^d$ leaves $v_l$, $1 \le l \le 2^d$. Alice can now sign
up to $2^d$ messages. To sign the $l$-th message $m$, she takes the $l$-th leaf $v_l$
and takes as the first part $\sigma_1(m)$ of the signature the previously defined
probabilistic signature $f_{[m],i}^{-1}(v_l)$ with respect to the reference value $v_l$.⁶ The
second part $\sigma_2(m)$ of the signature authenticates $v_l$. It contains the elements
$x_0 := r, x_1, \ldots, x_d := v_l$ on the path from the root $r$ to the leaf $v_l$ in the
authentication tree and authentication values for each node $x_m$, $1 \le m \le d$.
The authentication value of $x_m$ contains the parent $x_{m-1}$ and both of its
children $c_0$ and $c_1$ (one of them is $x_m$) and the signature $g_{[c_0 || c_1],j}^{-1}(x_{m-1})$ of
the concatenated children with respect to the reference value $x_{m-1}$, computed
by the second claw-free pair. The children of a node are authenticated jointly.
To verify Alice's signatures, Bob has to climb up the tree from the leaf in
the obvious way. If he finally computes the correct root $r$, he accepts the
signature.


Theorem 10.8. The GMR signature scheme is secure against adaptively- 
chosen-message attacks. 


Proof. See [GolMicRiv88]. 


Remarks: 


1. The full authentication tree and the authentication values for its nodes 
could be constructed in advance and stored. However, it is more efficient 
to develop it dynamically, as it is needed for signatures, and to store only 
the necessary information about its current state. 

6 Here we simplify a little and omit the “bridge items”.

2. The size of a GMR signature is of order $O(kd)$ if the inputs of the
claw-free pairs are of order $O(k)$ (as in the pair given in Proposition 10.5).
In practice, this size is considerable if the desired number $n$ of signatures
and hence $d = \log_2(n)$ increases. For example, think only of 10000
signatures with a security parameter $k = 1024$. The size of the signatures
and the number of computations of $f$ that are necessary to generate and to
verify signatures could be substantially reduced if authentication trees with
much larger branching degrees could be used instead of the binary one, thus
reducing the distance from a leaf to the root. Such signature schemes have
been developed, for example, by Dwork and Naor ([DwoNao94]) and Cramer and
Damgård ([CraDam96]). They are quite efficient and provably provide security
against adaptively-chosen-message attacks if the RSA assumption (Definition
6.7) holds. For example, taking a 1024-bit RSA modulus, a branching degree of
1000 and a tree of depth 3 in the Cramer-Damgård scheme, Alice could sign up
to $10^9$ messages, with the size of each signature being less than 4000 bits.
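The figures in Remark 2 can be checked with a few lines of arithmetic. The sketch below is only an illustration; the function names are hypothetical, and the product $k \cdot d$ is printed merely as the order of magnitude $O(kd)$, not as an exact signature size.

```python
def binary_tree_depth(num_signatures: int) -> int:
    """Smallest depth d with 2**d >= num_signatures, i.e. ceil(log2(n))."""
    return (num_signatures - 1).bit_length()


def tree_capacity(branching_degree: int, depth: int) -> int:
    """Number of leaves of a complete tree; one leaf is used per signature."""
    return branching_degree ** depth


# Binary GMR tree for 10000 signatures with security parameter k = 1024.
d = binary_tree_depth(10_000)
print("depth of the binary tree:", d)
print("k * d (order of the signature size in bits):", 1024 * d)

# Tree of branching degree 1000 and depth 3, as in the Cramer-Damgard example.
print("number of signable messages:", tree_capacity(1000, 3))
```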


10.4 A State-Free Signature Scheme 


The signing algorithm in GMR or in other authentication-tree-based signature
schemes is not state free: the signing algorithm has to store the current
state of the authentication tree, which depends on the already generated
signatures, and the next signature depends on this state. In this section, we
will describe a provably secure and quite efficient state-free digital
signature scheme introduced by Cramer and Shoup ([CraSho2000]). The scheme is
secure against adaptively-chosen-message attacks, provided the so-called
strong RSA assumption (see below) holds. Another state-free signature scheme
based on the strong RSA assumption has been introduced, for example, in
[GenHalRab99]. The security proof, which we will give below, shows the typical
features of such a proof. It runs by contradiction: a successful forging
algorithm $F$ is used to construct an attacker $A$ who successfully inverts
the underlying one-way function. The main problem is that in a chosen-message
attack, the forger $F$ is only successful if he can request signatures from
the legitimate signer. Now the legitimate signer, who uses his secret key,
cannot be called during the execution of $F$, because $A$ is only allowed to
use publicly accessible information. Thus, a major problem is to substitute
the legitimate signer by a simulation.

The moduli in the Cramer-Shoup signature scheme are defined with spe- 
cial types of primes. 


Definition 10.9. A prime $p$ is called a Sophie Germain prime if $2p + 1$ is
also a prime.⁷


7 Sophie Germain (1776-1831) proved the first case of Fermat’s Last Theorem for
prime exponents $p$ for which $2p + 1$ is also prime ([Osen74]).

Remark. In the Cramer-Shoup signature scheme we have to assume that
sufficiently many Sophie Germain primes exist. Otherwise there is no guarantee
that keys can be generated in polynomial time. The security proof given below
also relies on this assumption (see Lemma 10.11). There must be a positive
polynomial $P$ such that the number of $k$-bit Sophie Germain primes $p$ is
$\ge 2^k / P(k)$. Today, there is no rigorous mathematical proof for this. It
is not even known whether there are infinitely many Sophie Germain primes. On
the other hand, it is conjectured, and there are heuristic arguments and
numerical evidence, that the number of $k$-bit Sophie Germain primes is
asymptotically equal to $c \cdot 2^k / k^2$, where $c$ is an explicitly
computable constant ([Koblitz88]; [BatHor62]; [BatHor65]). Thus, there is
convincing evidence for the existence of sufficiently many Sophie Germain
primes.
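The numerical evidence mentioned in the remark can be reproduced for very small bit lengths. The following sketch counts $k$-bit Sophie Germain primes by brute force and prints the ratio to $2^k/k^2$; it uses trial division, so it is only suitable for toy sizes, and the helper names are assumptions made for this illustration.

```python
def is_prime(n: int) -> bool:
    """Trial division; adequate only for small n."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def is_sophie_germain(p: int) -> bool:
    """p is a Sophie Germain prime if p and 2p + 1 are both prime."""
    return is_prime(p) and is_prime(2 * p + 1)


def count_sophie_germain(k: int) -> int:
    """Number of Sophie Germain primes p with exactly k bits."""
    return sum(1 for p in range(1 << (k - 1), 1 << k) if is_sophie_germain(p))


for k in range(5, 17):
    count = count_sophie_germain(k)
    # The conjecture predicts that this ratio approaches a constant c.
    print(k, count, count / (2 ** k / k ** 2))
```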


The Strong RSA Assumption. The security of the Cramer-Shoup signature scheme
is based on the following strong RSA assumption introduced in [BarPfi97].

Let $I := \{n \in \mathbb{N} \mid n = pq,\ p \ne q \text{ prime numbers},\ |p| = |q|\}$ be the set of
RSA moduli and $I_k := \{n \in I \mid n = pq,\ |p| = |q| = k\}$.


Definition 10.10 (strong RSA assumption). For every positive polynomial $Q$
and every probabilistic polynomial algorithm $A$ which on inputs $n \in I$ and
$y \in \mathbb{Z}_n^*$ outputs an exponent $e > 1$ and an $x \in \mathbb{Z}_n^*$, there exists a
$k_0 \in \mathbb{N}$ such that

$$\mathrm{prob}\bigl(x^e = y : n \xleftarrow{u} I_k,\ y \xleftarrow{u} \mathbb{Z}_n^*,\ (e, x) \leftarrow A(n, y)\bigr) \le \frac{1}{Q(k)}$$

for $k \ge k_0$.


Remark. The strong RSA assumption implies the classical RSA assumption
(Definition 6.7). In the classical RSA assumption, the attacking algorithm has
to find an $e$-th root for $y \in \mathbb{Z}_n^*$, for a given $e$. Here the exponent is not
given. The adversary is successful if, given some $y \in \mathbb{Z}_n^*$, she can find an
exponent $e > 1$ such that she is able to extract the $e$-th root $x$ of $y$.
Today, the only known method for breaking either assumption is to solve the
factorization problem.


Let
$I_{SG} := \{n \in I \mid n = pq,\ p = 2\tilde p + 1,\ q = 2\tilde q + 1,\ \tilde p, \tilde q \text{ Sophie Germain primes}\}$
and $I_{SG,k} := I_{SG} \cap I_k$.


Lemma 10.11. Assume that there is a positive polynomial $P$ such that the
number of $k$-bit Sophie Germain primes is $\ge 2^k / P(k)$. Then the strong
RSA assumption implies that for every positive polynomial $Q$ and every
probabilistic polynomial algorithm $A$ which on inputs $n \in I_{SG}$ and
$y \in \mathbb{Z}_n^*$ outputs an exponent $e > 1$ and an $x \in \mathbb{Z}_n^*$, there exists a
$k_0 \in \mathbb{N}$ such that

$$\mathrm{prob}\bigl(x^e = y : n \xleftarrow{u} I_{SG,k},\ y \xleftarrow{u} \mathbb{Z}_n^*,\ (e, x) \leftarrow A(n, y)\bigr) \le \frac{1}{Q(k)}$$

for $k \ge k_0$.

Proof. The distribution $n \xleftarrow{u} I_{SG,k}$ is polynomially bounded by the
distribution $n \xleftarrow{u} I_k$ if the existence of sufficiently many Sophie Germain
primes is assumed. Thus, we may replace $n \xleftarrow{u} I_k$ by $n \xleftarrow{u} I_{SG,k}$ in the
strong RSA assumption (Proposition B.26).


The Cramer-Shoup Signature Scheme. In the key generation and in the signing
procedure, a probabilistic polynomial algorithm $\mathrm{GenPrime}(1^k)$ with the
following properties is used:

1. On input $1^k$, GenPrime outputs a $k$-bit prime.

2. If $\mathrm{GenPrime}(1^k)$ is executed $R(k)$ times ($R$ a positive polynomial), then
the probability that any two of the generated primes are equal is negligibly
small; i.e., for every positive polynomial $P$ there is a $k_0 \in \mathbb{N}$ such
that for all $k \ge k_0$

$$\mathrm{prob}\bigl(e_{j_1} = e_{j_2} \text{ for some } j_1 \ne j_2 : e_j \leftarrow \mathrm{GenPrime}(1^k),\ 1 \le j \le R(k)\bigr) \le \frac{1}{P(k)}.$$

Such algorithms exist. For example, an algorithm that randomly and uniformly
chooses primes of binary length $k$ satisfies the requirements. Namely, the
probability that $e_{j_1} = e_{j_2}$ ($j_1$ and $j_2$ fixed, $j_1 \ne j_2$) is about
$k/2^k$ by the Prime Number Theorem (Theorem A.68), and there are
$\binom{R(k)}{2} < R(k)^2/2$ subsets $\{j_1, j_2\}$ of $\{1, \ldots, R(k)\}$. There are
suitable implementations of GenPrime which are much more efficient than the
uniform sampling algorithm (see [CraSho2000]).
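A minimal sketch of the uniform sampling variant of GenPrime is given below. It is an assumption-laden illustration, not the efficient implementation referred to in [CraSho2000]: candidates are drawn uniformly from the $k$-bit integers and tested with the Miller-Rabin test, so the output is only a probable prime, and the names `gen_prime` and `is_probable_prime` are hypothetical.

```python
import secrets


def is_probable_prime(n: int, rounds: int = 40) -> bool:
    """Miller-Rabin test; a composite n passes with probability <= 4**-rounds."""
    if n < 2:
        return False
    for q in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37):
        if n % q == 0:
            return n == q
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = secrets.randbelow(n - 3) + 2        # a in [2, n - 2]
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False                        # a is a witness: n is composite
    return True


def gen_prime(k: int) -> int:
    """Draw k-bit integers uniformly until one passes the primality test."""
    if k < 2:
        raise ValueError("k must be at least 2")
    while True:
        candidate = secrets.randbits(k - 1) | (1 << (k - 1))  # top bit set
        if is_probable_prime(candidate):
            return candidate


print(gen_prime(161))   # an (l + 1)-bit prime for l = 160
```

Because every $k$-bit candidate is equally likely and composites are rejected, the accepted primes are (up to the error of the test) uniformly distributed among the $k$-bit primes, which is the property used in the collision estimate above.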

Let $N \in \mathbb{N}$, $N > 1$, be a constant. To set up a Cramer-Shoup signature
scheme, we choose two security parameters $k$ and $l$, with
$k^{1/N} < l + 1 < k - 1$. Then we choose a collision-resistant hash function
$h : \{0,1\}^* \to \{0,1\}^l$. More precisely, by using $H$’s key generator, we
randomly select a hash function from $H_l$, where $H$ is a collision-resistant
family of hash functions and $H_l$ is the subset of functions with security
parameter $l$ (without loss of generality, we assume that the functions in
$H_l$ map to $\{0,1\}^l$). We proved in Section 10.2 that such
collision-resistant families exist if the RSA assumption and, as a
consequence, the factoring assumption hold. The output of $h$ is considered as
a number in $\{0, \ldots, 2^l - 1\}$. All users of the scheme generate their
signatures by using the hash function $h$.⁸

Given k,l and h, each user Alice generates her public and secret key. 


Key Generation.


1. Alice randomly chooses a modulus $n \xleftarrow{u} I_{SG,k}$, i.e., she randomly and
uniformly chooses Sophie Germain primes $\tilde p$ and $\tilde q$ of length $k - 1$ and
sets $p := 2\tilde p + 1$, $q := 2\tilde q + 1$ and $n := pq$.
2. She chooses $g \xleftarrow{u} \mathrm{QR}_n$ and $x \xleftarrow{u} \mathrm{QR}_n$ at random and generates an
$(l+1)$-bit prime $\tilde e := \mathrm{GenPrime}(1^{l+1})$.
3. $(n, g, x, \tilde e)$ is the public key; $(p, q)$ is the secret key.


8 In practice we may use, for example, $k = 512$, $l = 160$ and $h =$ SHA-1, which
was believed to be collision resistant at the time of writing (practical
collisions for SHA-1 have since been found, so a newer hash function such as
SHA-256 is preferable today).


Remark. Using Sophie Germain primes ensures that the order
$\frac{p-1}{2} \cdot \frac{q-1}{2} = \tilde p \tilde q$ of $\mathrm{QR}_n$ is a product of distinct primes.
Thus it is a cyclic group; it is the cyclic subgroup of order $\tilde p \tilde q$ of
$\mathbb{Z}_n^*$.
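The key generation can be sketched for toy parameters as follows. This is only an illustration under stated assumptions: primality is decided by trial division (so $k$ must be tiny), `random_prime` stands in for GenPrime, and all function names are hypothetical. Quadratic residues are sampled by squaring a uniformly chosen unit, which yields the uniform distribution on $\mathrm{QR}_n$.

```python
import secrets
from math import gcd


def is_prime(n: int) -> bool:
    """Trial division; adequate only for the toy sizes used here."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def random_prime(bits: int) -> int:
    """Uniformly random prime with exactly `bits` bits (stand-in for GenPrime)."""
    while True:
        c = secrets.randbits(bits - 1) | (1 << (bits - 1))
        if is_prime(c):
            return c


def random_sophie_germain(bits: int) -> int:
    """Uniformly random Sophie Germain prime with exactly `bits` bits."""
    while True:
        c = random_prime(bits)
        if is_prime(2 * c + 1):
            return c


def random_qr(n: int) -> int:
    """Uniform element of QR_n, obtained by squaring a uniform unit of Z_n."""
    while True:
        a = secrets.randbelow(n - 1) + 1
        if gcd(a, n) == 1:
            return a * a % n


def keygen(k: int, l: int):
    """Return ((n, g, x, e_tilde), (p, q)) for security parameters k and l."""
    if not l + 1 < k - 1:
        raise ValueError("need l + 1 < k - 1")
    pt = random_sophie_germain(k - 1)
    qt = random_sophie_germain(k - 1)
    while qt == pt:                       # the two primes must be distinct
        qt = random_sophie_germain(k - 1)
    p, q = 2 * pt + 1, 2 * qt + 1
    n = p * q
    return (n, random_qr(n), random_qr(n), random_prime(l + 1)), (p, q)


public_key, secret_key = keygen(16, 8)
print(public_key, secret_key)
```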


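In the following, all computations are done in $\mathbb{Z}_n^*$ unless otherwise stated
and, as usual, we identify $\mathbb{Z}_n = \{0, \ldots, n - 1\}$.

Signing. It is possible to sign arbitrary messages $m \in \{0,1\}^*$. To sign $m$,
Alice generates an $(l+1)$-bit prime $e := \mathrm{GenPrime}(1^{l+1})$ and randomly
chooses $\tilde y \xleftarrow{u} \mathrm{QR}_n$. She computes

$$\tilde x := \tilde y^{\tilde e} \cdot g^{-h(m)} \quad\text{and}\quad y := \bigl(x \cdot g^{h(\tilde x)}\bigr)^{e^{-1}},$$

where $e^{-1}$ is the inverse of $e$ in $\mathbb{Z}^*_{\varphi(n)}$ (the powers are computed in
$\mathbb{Z}_n^*$, which is of order $\varphi(n)$). The signature $\sigma$ of $m$ is $(e, y, \tilde y)$.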


Remarks: 


1. Taking the $e^{-1}$-th power in the computation of $y$ means computing the
$e$-th root in $\mathbb{Z}_n^*$. Alice needs her secret key for this computation. Since
$|e| = l + 1 < k - 1 = |\tilde p| = |\tilde q|$, the prime $e$ does not divide
$\varphi(n) = 4\tilde p \tilde q$. Hence, Alice can easily compute the inverse $e^{-1}$ of $e$
in $\mathbb{Z}^*_{\varphi(n)}$ by using her secret $p$ and $q$ (and the extended Euclidean
algorithm, see Proposition A.16).

2. Signing is a probabilistic algorithm, because the prime $e$ is generated
probabilistically and a random quadratic residue $\tilde y$ is chosen. After these
choices, the computation of the signature is deterministic. Therefore, we
can describe the signature $\sigma$ of $m$ as the value of a mathematical function
sign:

$$\sigma = \mathrm{sign}(h, n, g, x, \tilde e, e, \tilde y, m).$$

To compute the function sign by an algorithm, Alice has to use her knowledge
about the prime factors of $n$.


Verification. Recipient Bob verifies a signature $\sigma = (e, y, \tilde y)$ of Alice for
message $m$ as follows:
1. First, he checks whether $e$ is an odd $(l+1)$-bit number that is not
divisible by $\tilde e$.
2. Then he computes

$$\tilde x := \tilde y^{\tilde e} \cdot g^{-h(m)}$$

and checks whether

$$x = y^e \cdot g^{-h(\tilde x)}.$$

He accepts if both checks are affirmative; otherwise he rejects.
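Signing and verification can be tried out with toy parameters. The sketch below is not a secure implementation: it assumes $k = 16$ and $l = 8$, uses SHA-256 truncated to $l$ bits as a stand-in for $h$ (which is of course not collision resistant at this size), finds the Sophie Germain primes by a deterministic search, and encodes $\tilde x$ as a big-endian byte string before hashing. All names are hypothetical.

```python
import hashlib
import secrets
from math import gcd

K, L = 16, 8                    # toy security parameters with L + 1 < K - 1


def is_prime(n: int) -> bool:
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))


def next_sophie_germain(start: int) -> int:
    c = start
    while not (is_prime(c) and is_prime(2 * c + 1)):
        c += 1
    return c


def random_prime(bits: int) -> int:
    while True:
        c = secrets.randbits(bits - 1) | (1 << (bits - 1))
        if is_prime(c):
            return c


def random_qr(n: int) -> int:
    while True:
        a = secrets.randbelow(n - 1) + 1
        if gcd(a, n) == 1:
            return a * a % n


def h(data: bytes) -> int:
    """Toy hash with L-bit output (assumption: truncated SHA-256)."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % (1 << L)


def to_bytes(v: int) -> bytes:
    return v.to_bytes((v.bit_length() + 7) // 8 or 1, "big")


# Key generation (secret: P, Q and hence PHI).
PT = next_sophie_germain(1 << (K - 2))
QT = next_sophie_germain(PT + 1)
P, Q = 2 * PT + 1, 2 * QT + 1
N, PHI = P * Q, (P - 1) * (Q - 1)
G, X, E_TILDE = random_qr(N), random_qr(N), random_prime(L + 1)


def sign(m: bytes):
    e = random_prime(L + 1)
    while e == E_TILDE:          # such a signature would be rejected; redraw
        e = random_prime(L + 1)
    y_t = random_qr(N)
    x_t = pow(y_t, E_TILDE, N) * pow(G, -h(m), N) % N
    # e-th root via the inverse of e modulo phi(n); needs the secret key.
    y = pow(X * pow(G, h(to_bytes(x_t)), N) % N, pow(e, -1, PHI), N)
    return e, y, y_t


def verify(m: bytes, sig) -> bool:
    e, y, y_t = sig
    if e % 2 == 0 or e.bit_length() != L + 1 or e % E_TILDE == 0:
        return False
    x_t = pow(y_t, E_TILDE, N) * pow(G, -h(m), N) % N
    return X == pow(y, e, N) * pow(G, -h(to_bytes(x_t)), N) % N


signature = sign(b"message")
print(signature, verify(b"message", signature))
```

Note that `verify` uses only the public values `N`, `G`, `X` and `E_TILDE`, whereas `sign` needs `PHI`, i.e., the factorization of the modulus.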
Remarks:


1. Note that the verification algorithm does not verify that $e$ is a prime;
it only checks whether $e$ is odd. A primality test would considerably
decrease the efficiency of verification, and the security of the scheme
does not require it (as the security proof shows).

2. If Alice generates a signature $(e, y, \tilde y)$ with $e = \tilde e$, then this signature
is not accepted by the verification procedure. However, since both $e$ and
$\tilde e$ are generated by GenPrime, this happens only with negligible
probability, and Alice could simply generate a new prime in this case.


Theorem 10.12. If the strong RSA assumption holds, “many” Sophie Germain
primes exist and $H$ is collision resistant, then the Cramer-Shoup signature
scheme is secure against adaptively-chosen-message attacks.


Remark. There is a variant of the signature scheme which does not require
the collision resistance of the hash function (see [CraSho2000]). The family
$H$ is only assumed to be a universal one-way family of hash functions
([NaoYun89]; [BelRog97]). The universal one-way property is weaker than
full collision resistance: if an adversary Eve first chooses a message $m$ and
then a random key $i$ is chosen, it should be infeasible for Eve to find
$m' \ne m$ with $h_i(m) = h_i(m')$. Note that the size of the key can grow with
the length of $m$.


In the proof of the theorem we need the following technical lemma. 


Lemma 10.13. There is a deterministic polynomial algorithm that for all $k$,
given $n \in I_{SG,k}$, an odd natural number $e$ with $|e| < k - 1$, a number $f$
and elements $u, v \in \mathbb{Z}_n^*$ with $u^e = v^f$ as inputs, computes the $r$-th root
$v^{r^{-1}} \in \mathbb{Z}_n^*$ of $v$ for $r := e/d$ and $d := \gcd(e, f)$.


Proof. $e$ and hence $r$ and $d$ are prime to $\varphi(n)$, since $\varphi(n) = 4\tilde p \tilde q$, with
Sophie Germain primes $\tilde p$ and $\tilde q$ of binary length $k - 1$, and $e$ is an odd
number with $|e| < k - 1$. Thus, the inverse elements $r^{-1}$ of $r$ and $d^{-1}$ of
$d$ in $\mathbb{Z}^*_{\varphi(n)}$ exist and can be computed by the extended Euclidean algorithm
(Proposition A.16). Only their existence is used in the following; the
algorithm itself never evaluates them and therefore does not need $\varphi(n)$.
Let $s := f/d$. Since $r$ is prime to $s$, the extended Euclidean algorithm
(Algorithm A.5) computes integers $m$ and $m'$, with $sm + rm' = 1$. We have

$$u^r = (u^e)^{d^{-1}} = (v^f)^{d^{-1}} = v^s$$

and

$$(u^m \cdot v^{m'})^r = (v^s)^m \cdot (v^r)^{m'} = v^{sm + rm'} = v.$$

By setting

$$v^{r^{-1}} = u^m \cdot v^{m'}$$

we obtain the $r$-th root of $v$.
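The algorithm in the proof translates directly into code. The following sketch assumes Python 3.8 or later (for modular powers with negative exponents); the function names are hypothetical. Note that the factorization of $n$ is not an input.

```python
from math import gcd


def ext_gcd(a: int, b: int):
    """Return (g, s, t) with s * a + t * b = g = gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, s, t = ext_gcd(b, a % b)
    return g, t, s - (a // b) * t


def rth_root(n: int, e: int, f: int, u: int, v: int):
    """Given u**e == v**f (mod n), return (r, w) with r = e / gcd(e, f)
    and w**r == v (mod n), following the proof of Lemma 10.13."""
    d = gcd(e, f)
    r, s = e // d, f // d
    _, m, m_prime = ext_gcd(s, r)            # s * m + r * m' = 1
    # Negative exponents are fine because u and v are units modulo n.
    return r, pow(u, m, n) * pow(v, m_prime, n) % n


# Toy instance: n = 11 * 23 with Sophie Germain primes 5 and 11.
n, e, f = 253, 21, 12                        # d = 3, r = 7, s = 4
secret_root = 2
v = pow(secret_root, 7, n)                   # v = 128
u = pow(secret_root, 4, n)                   # then u**21 == v**12 (mod n)
r, w = rth_root(n, e, f, u, v)
print(r, w, pow(w, r, n) == v)
```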
Proof (of Theorem 10.12). The proof runs by contradiction.
Let $\mathrm{Forger}(h, n, g, x, \tilde e)$ be a probabilistic polynomial forging algorithm
which adaptively requests the signatures for $t$ messages, where $t = R(k)$ for
some polynomial $R$, and then produces a valid forgery with non-negligible
probability for infinitely many security parameters $(k, l)$. By
non-negligible, we mean that the probability is $\ge 1/Q(k)$ for some positive
polynomial $Q$.

We will define an attacking algorithm $A$ that on inputs $n \in I_{SG}$ and
$z \in \mathbb{Z}_n^*$ successfully computes an $r$-th root modulo $n$ of $z$, without knowing
the prime factors of $n$ (contradicting the strong RSA assumption, by Lemma
10.11).

On inputs $n \in I_{SG,k}$ and $z \in \mathbb{Z}_n^*$, $A$ works as follows:


1. Randomly and uniformly select the second security parameter $l$ and
randomly choose a hash function $h \in H_l$ (by using $H$’s key generator).

2. In a suitable way (described below for each type of forgery), generate the
missing elements $g$, $x$ and $\tilde e$ of a public key $(n, g, x, \tilde e)$.

3. Interact with Forger to obtain a forged signature $(m, \sigma)$ for the public
key $(n, g, x, \tilde e)$.
Since the prime factors of $n$ are not known in this setting, Forger cannot
get the signatures he requests from the original signing algorithm.
Instead, he obtains them from $A$. Since $g$, $x$ and $\tilde e$ were chosen suitably,
$A$ is able to supply Forger with valid signatures without knowing
the prime factors of $n$.

4. By use of the forged signature $(m, \sigma)$, compute an $r$-th root modulo $n$ of
$z$ for some $r > 1$.


A simulates the legitimate signer in step 3. We therefore also say that Forger 
runs against a simulated signer. Simulating the signer is the core of the proof. 
To ensure that Forger yields a valid signature with a non-negligible proba- 
bility, the probabilistic setting where Forger operates must be identical (or 
at least very close) to the setting where Forger runs against the legitimate 
signer. This means, in particular, that the keys generated in step 2 must be 
distributed as are the keys in the original signature scheme, and the signatures 
supplied to Forger in step 3 must be distributed as if they were generated by 
the legitimate signer. 

We denote by $m_i$, $1 \le i \le t$, the messages for which signatures are
requested by Forger, and by $\sigma_i = (e_i, y_i, \tilde y_i)$ the corresponding signatures
supplied to Forger. Let $(m, \sigma)$ be the output of Forger, i.e., $m$ is a
message $\ne m_i$, $1 \le i \le t$, and $\sigma = (e, y, \tilde y)$ is the forged signature of
$m$. Let $\tilde x_i := \tilde y_i^{\tilde e} \cdot g^{-h(m_i)}$ and $\tilde x := \tilde y^{\tilde e} \cdot g^{-h(m)}$.

We distinguish three (overlapping) types of forgery:

1. Type 1. For some $i$, $1 \le i \le t$, $e_i$ divides $e$ and $\tilde x = \tilde x_i$.
2. Type 2. For some $i$, $1 \le i \le t$, $e_i$ divides $e$ and $\tilde x \ne \tilde x_i$.
3. Type 3. For all $i$, $1 \le i \le t$, $e_i$ does not divide $e$.
Here, note that the number $e$ in the forged signature can be non-prime (the
verification procedure does not test whether $e$ is a prime). The numbers $e_i$
are primes (see below).
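The case distinction can be made explicit by a small helper. This is only an illustrative sketch with hypothetical names; it takes the exponent $e$ and the value $\tilde x$ of the forgery together with the pairs $(e_i, \tilde x_i)$ of the requested signatures, and returns the set of types that apply (types 1 and 2 may both apply, for different indices $i$).

```python
def forgery_types(e: int, x_tilde: int, requested) -> set:
    """Classify a forgery; `requested` is a list of pairs (e_i, x_tilde_i)."""
    dividing = [(e_i, x_i) for e_i, x_i in requested if e % e_i == 0]
    if not dividing:
        return {3}                       # no e_i divides e
    # For each e_i dividing e, compare x_tilde with x_tilde_i.
    return {1 if x_i == x_tilde else 2 for _, x_i in dividing}


print(forgery_types(15, 7, [(3, 7), (5, 9), (11, 1)]))
print(forgery_types(13, 7, [(3, 7)]))
```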

We may define a forging algorithm $\mathrm{Forger}_1$, which yields the output of
Forger if it is of type 1, and otherwise returns some message and signature
not satisfying the verification condition. Analogously, we define $\mathrm{Forger}_2$
and $\mathrm{Forger}_3$. Then the valid forgeries of $\mathrm{Forger}_1$ are of type 1, those of
$\mathrm{Forger}_2$ are of type 2 and those of $\mathrm{Forger}_3$ are of type 3. If Forger
succeeds with non-negligible probability, then at least one of the three
“single-type forgers” succeeds with non-negligible probability. Replacing
Forger by this algorithm, we may assume from now on that Forger generates
valid forgeries of one type.


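Case 1: Forger is of type 1. To generate the public key in step 2, $A$
proceeds as follows:


1. Generate $(l+1)$-bit primes $e_i := \mathrm{GenPrime}(1^{l+1})$, $1 \le i \le t$. Set

$$g := z^{2\prod_{1 \le i \le t} e_i}.$$

2. Randomly choose $w \xleftarrow{u} \mathbb{Z}_n^*$ and set

$$x := w^{2\prod_{1 \le i \le t} e_i}.$$

3. Set $\tilde e := \mathrm{GenPrime}(1^{l+1})$.


To generate the signature for the $i$-th message $m_i$, $A$ randomly chooses
$\tilde y_i \xleftarrow{u} \mathrm{QR}_n$ and computes

$$\tilde x_i := \tilde y_i^{\tilde e} \cdot g^{-h(m_i)} \quad\text{and}\quad y_i := \bigl(x \cdot g^{h(\tilde x_i)}\bigr)^{e_i^{-1}}.$$

Though $A$ does not know the prime factors of $n$, she can easily compute
the $e_i$-th root to get $y_i$, because she knows the $e_i$-th root of $x$ and $g$ by
construction.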
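The simulation of the signer in case 1 can be checked with toy parameters. In the sketch below (hypothetical names, $k = 16$, $l = 8$, truncated SHA-256 as a stand-in for $h$), the prime factors are used only to build the challenge modulus and are then deleted. The $e_i$-th root is obtained as $(w \cdot z^{h(\tilde x_i)})^{2\prod_{j \ne i} e_j}$. Because the toy primes are so short, $\tilde e$ is redrawn until it differs from all $e_i$; at realistic sizes a coincidence has negligible probability.

```python
import hashlib
import secrets
from math import gcd, prod

K, L, T = 16, 8, 3              # toy parameters; T signature requests


def is_prime(n: int) -> bool:
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))


def next_sophie_germain(start: int) -> int:
    c = start
    while not (is_prime(c) and is_prime(2 * c + 1)):
        c += 1
    return c


def random_prime(bits: int) -> int:
    while True:
        c = secrets.randbits(bits - 1) | (1 << (bits - 1))
        if is_prime(c):
            return c


def random_unit(n: int) -> int:
    while True:
        a = secrets.randbelow(n - 1) + 1
        if gcd(a, n) == 1:
            return a


def h(data: bytes) -> int:
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % (1 << L)


def to_bytes(v: int) -> bytes:
    return v.to_bytes((v.bit_length() + 7) // 8 or 1, "big")


# Challenge (n, z): the factors are forgotten, A sees only n and z.
pt = next_sophie_germain(1 << (K - 2))
qt = next_sophie_germain(pt + 1)
n = (2 * pt + 1) * (2 * qt + 1)
del pt, qt
z = random_unit(n)

# Step 2 of A in case 1.
e_list = [random_prime(L + 1) for _ in range(T)]
E = prod(e_list)
w = random_unit(n)
g, x = pow(z, 2 * E, n), pow(w, 2 * E, n)
e_tilde = random_prime(L + 1)
while e_tilde in e_list:
    e_tilde = random_prime(L + 1)


def simulated_sign(i: int, m: bytes):
    y_t = pow(random_unit(n), 2, n)
    x_t = pow(y_t, e_tilde, n) * pow(g, -h(m), n) % n
    rest = 2 * (E // e_list[i])          # 2 * product of the e_j with j != i
    y = pow(w * pow(z, h(to_bytes(x_t)), n) % n, rest, n)
    return e_list[i], y, y_t


def verify(m: bytes, sig) -> bool:
    e, y, y_t = sig
    if e % 2 == 0 or e.bit_length() != L + 1 or e % e_tilde == 0:
        return False
    x_t = pow(y_t, e_tilde, n) * pow(g, -h(m), n) % n
    return x == pow(y, e, n) * pow(g, -h(to_bytes(x_t)), n) % n


for i, message in enumerate([b"first", b"second", b"third"]):
    print(verify(message, simulated_sign(i, message)))
```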

Forger then outputs a forged signature $\sigma = (e, y, \tilde y)$ for a message
$m \notin \{m_1, \ldots, m_t\}$. If the forged signature does not pass the verification
procedure, then the forger did not produce a valid signature. In this case,
$A$ returns a random exponent and a random element in $\mathbb{Z}_n^*$, and stops.
Otherwise, Forger yields a valid type-1 forgery. Thus, for some $j$,
$1 \le j \le t$, we have $e_j \mid e$, $\tilde x = \tilde x_j$ and

$$\tilde y^{\tilde e} = \tilde x \cdot g^{h(m)}, \qquad \tilde y_j^{\tilde e} = \tilde x_j \cdot g^{h(m_j)}.$$

Now $A$ has to compute an $r$-th root of $z$. If $h(m) \ne h(m_j)$, which happens
almost with certainty, since $H$ is collision resistant, we get, by dividing the
two equations, the equation

$$\tilde z^{\tilde e} = g^a = z^{2a\prod_{1 \le i \le t} e_i}$$

with $0 < a < 2^l$. Here recall that $h$ yields $l$-bit numbers and these are
$< 2^l$. Then $a := h(m) - h(m_j)$ and $\tilde z := \tilde y \cdot \tilde y_j^{-1}$ if $h(m) - h(m_j) > 0$;
otherwise $a := h(m_j) - h(m)$ and $\tilde z := \tilde y_j \cdot \tilde y^{-1}$. Since $\tilde e$ is an
$(l+1)$-bit prime and thus $> 2^l$, $\tilde e$ does not divide $a$. Moreover, $\tilde e$ and
the $e_i$ were chosen by GenPrime. Thus, with a high probability (stated more
precisely below), $\tilde e \ne e_i$, $1 \le i \le t$. In this case, $A$ can compute
$z^{\tilde e^{-1}}$ using Lemma 10.13 (note that $\tilde e$ is an $(l+1)$-bit prime and
$l + 1 < k - 1$). $A$ returns the exponent $\tilde e$ and the $\tilde e$-th root $z^{\tilde e^{-1}}$.

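We still have to compute the probability of success of $A$. Interacting with
the legitimate signer, Forger is assumed to be successful. This means that
there exists a positive polynomial $P$, an infinite subset $\mathcal{K} \subseteq \mathbb{N}$ and an
$l(k)$ for each $k \in \mathcal{K}$, such that Forger produces a valid signature with a
probability $\ge 1/P(k)$ for all the pairs $(k, l(k))$, $k \in \mathcal{K}$, of security
parameters. To state this more precisely, let
$M_i(h, n, g, x, \tilde e, (m_j, \sigma_j)_{1 \le j < i})$ be the random variable describing the
choice of $m_i$ by Forger ($1 \le i \le t$). The signatures $\sigma_i$ are additional
inputs to Forger. We have for $k \in \mathcal{K}$ and $l := l(k)$:

$$
\begin{aligned}
p_{\mathrm{success},\mathrm{Forger}}(k, l) = \mathrm{prob}\bigl(&\mathrm{Verify}(h, n, g, x, \tilde e, m, \sigma) = \mathrm{accept} : \\
& h \leftarrow H_l,\ n \xleftarrow{u} I_{SG,k}, \\
& g \xleftarrow{u} \mathrm{QR}_n,\ x \xleftarrow{u} \mathrm{QR}_n,\ \tilde e \leftarrow \mathrm{GenPrime}(1^{l+1}), \\
& e_i \leftarrow \mathrm{GenPrime}(1^{l+1}),\ \tilde y_i \xleftarrow{u} \mathrm{QR}_n, \\
& m_i \leftarrow M_i(h, n, g, x, \tilde e, (m_j, \sigma_j)_{1 \le j < i}), \\
& \sigma_i = \mathrm{sign}(h, n, g, x, \tilde e, e_i, \tilde y_i, m_i),\ 1 \le i \le t, \\
& (m, \sigma) \leftarrow \mathrm{Forger}(h, n, g, x, \tilde e, (\sigma_i)_{1 \le i \le t})\bigr) \\
\ge \frac{1}{P(k)}.&
\end{aligned}
$$

Here recall that the signature $\sigma_i$ of $m_i$ can be derived deterministically
from $h, n, g, x, \tilde e, e_i, \tilde y_i$ and $m_i$ as the value of a mathematical function
sign (see Remark 2 on the signing procedure).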

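Let $g(z, (e_i)_{1 \le i \le t}) = g = z^{2\prod_{1 \le i \le t} e_i}$ and
$x(w, (e_i)_{1 \le i \le t}) = x = w^{2\prod_{1 \le i \le t} e_i}$ be the specific $g$ and $x$
constructed by $A$. $A(n, z)$ succeeds in computing the root $z^{\tilde e^{-1}}$ if the
Forger produces a valid signature, if $h(m) \ne h(m_j)$ and if $\tilde e \ne e_i$,
$1 \le i \le t$. All other steps in $A$ are deterministic. Therefore, we may
compute the probability of success of $A$ (for security parameter $k$) as
follows:

$$
\begin{aligned}
&\mathrm{prob}\bigl(v^r = z : n \xleftarrow{u} I_{SG,k},\ z \xleftarrow{u} \mathbb{Z}_n^*,\ (v, r) \leftarrow A(n, z)\bigr) \\
&\quad \ge \mathrm{prob}\bigl(\mathrm{Verify}(h, n, g(z, (e_i)_i), x(w, (e_i)_i), \tilde e, m, \sigma) = \mathrm{accept}, \\
&\qquad\quad h(m) \ne h(m_j),\ \tilde e \ne e_i,\ 1 \le i \le t : \\
&\qquad\quad h \leftarrow H_l,\ n \xleftarrow{u} I_{SG,k},\ z \xleftarrow{u} \mathbb{Z}_n^*,\ w \xleftarrow{u} \mathbb{Z}_n^*,\ \tilde e \leftarrow \mathrm{GenPrime}(1^{l+1}), \\
&\qquad\quad e_i \leftarrow \mathrm{GenPrime}(1^{l+1}),\ \tilde y_i \xleftarrow{u} \mathrm{QR}_n, \\
&\qquad\quad m_i \leftarrow M_i(h, n, g(z, (e_i)_i), x(w, (e_i)_i), \tilde e, (m_j, \sigma_j)_{1 \le j < i}), \\
&\qquad\quad \sigma_i = \mathrm{sign}(h, n, g(z, (e_i)_i), x(w, (e_i)_i), \tilde e, e_i, \tilde y_i, m_i),\ 1 \le i \le t, \\
&\qquad\quad (m, \sigma) \leftarrow \mathrm{Forger}(h, n, g(z, (e_i)_i), x(w, (e_i)_i), \tilde e, (\sigma_i)_{1 \le i \le t})\bigr) \\
&\quad =: p_1.
\end{aligned}
$$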

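Let $Q$ be a positive polynomial. GenPrime, when called polynomially many
times, generates the same prime more than once only with negligible
probability, and $h$ is randomly chosen from a collision-resistant family of
hash functions. Thus, there is some $k_0$ such that both the probability that
$\tilde e \ne e_i$ for $1 \le i \le t$ and the probability that $h(m) \ne h(m_i)$ for
$1 \le i \le t$ are $\ge 1 - 1/Q(k)$, for $k \ge k_0$.⁹ Hence, we get for $k \ge k_0$ that

$$
\begin{aligned}
p_1 \ge \mathrm{prob}\bigl(&\mathrm{Verify}(h, n, g(z, (e_i)_i), x(w, (e_i)_i), \tilde e, m, \sigma) = \mathrm{accept} : \\
& h \leftarrow H_l,\ n \xleftarrow{u} I_{SG,k},\ z \xleftarrow{u} \mathbb{Z}_n^*,\ w \xleftarrow{u} \mathbb{Z}_n^*,\ \tilde e \leftarrow \mathrm{GenPrime}(1^{l+1}), \\
& e_i \leftarrow \mathrm{GenPrime}(1^{l+1}),\ \tilde y_i \xleftarrow{u} \mathrm{QR}_n, \\
& m_i \leftarrow M_i(h, n, g(z, (e_i)_i), x(w, (e_i)_i), \tilde e, (m_j, \sigma_j)_{1 \le j < i}), \\
& \sigma_i = \mathrm{sign}(h, n, g(z, (e_i)_i), x(w, (e_i)_i), \tilde e, e_i, \tilde y_i, m_i),\ 1 \le i \le t, \\
& (m, \sigma) \leftarrow \mathrm{Forger}(h, n, g(z, (e_i)_i), x(w, (e_i)_i), \tilde e, (\sigma_i)_{1 \le i \le t})\bigr) \\
& \cdot \Bigl(1 - \frac{1}{Q(k)}\Bigr) \cdot \Bigl(1 - \frac{1}{Q(k)}\Bigr) \\
=: p_2.&
\end{aligned}
$$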

The first factor in $p_2$ is the probability that Forger successfully yields a
valid signature when interacting with the simulated signer. This probability
is equal to Forger’s probability of success, $p_{\mathrm{success},\mathrm{Forger}}(k, l)$, when he
interacts with the legitimate signer. Namely, $g(z, (e_i)_i)$ and $x(w, (e_i)_i)$
are uniformly distributed quadratic residues, independent of the distribution
of the $e_i$, since $z \xleftarrow{u} \mathbb{Z}_n^*$ and $w \xleftarrow{u} \mathbb{Z}_n^*$.¹⁰ Thus, we may replace
$g(z, (e_i)_i)$, $z \xleftarrow{u} \mathbb{Z}_n^*$ and $x(w, (e_i)_i)$, $w \xleftarrow{u} \mathbb{Z}_n^*$ by $g \xleftarrow{u} \mathrm{QR}_n$ and
$x \xleftarrow{u} \mathrm{QR}_n$. We get

$$p_2 = p_{\mathrm{success},\mathrm{Forger}}(k, l) \cdot \Bigl(1 - \frac{1}{Q(k)}\Bigr)^2.$$

For $k \in \mathcal{K}$ and $l = l(k)$, we have $p_{\mathrm{success},\mathrm{Forger}}(k, l) \ge 1/P(k)$. The
probability that $A$ chooses $l(k)$ in her first step is $\ge 1/k$, and we finally
obtain that

$$\mathrm{prob}\bigl(v^r = z : n \xleftarrow{u} I_{SG,k},\ z \xleftarrow{u} \mathbb{Z}_n^*,\ (v, r) \leftarrow A(n, z)\bigr) \ge \frac{1}{k} \cdot \frac{1}{P(k)} \cdot \Bigl(1 - \frac{1}{Q(k)}\Bigr)^2$$

for the infinitely many $k \in \mathcal{K}$. This contradicts the strong RSA assumption.
The proof of Theorem 10.12 is finished in the case where the forger is of
type 1. The other cases are proven below.


9 Note that the messages $m$ and $m_i$ are generated by a probabilistic polynomial
algorithm, namely Forger.

10 Here note that $\prod_i e_i$ and $\tilde e$ are prime to $\varphi(n) = 4\tilde p \tilde q$, because $\tilde e$ and the
$e_i$ are $(l+1)$-bit primes and $l + 1 < k - 1 = |\tilde p| = |\tilde q|$.
Remark. Before we study the next cases, let us have a look back at the proof
just finished. The core of the proof is to simulate the legitimate signer and
to supply valid signatures to the forger, without knowing the prime factors
of the modulus $n$. We managed to do this by a clever choice of the second
parts of the public keys. Here, the key point is that given a fixed first part
of the public key, i.e., given a modulus $n$, the joint distribution of the
second part $(g, x, \tilde e)$ of the public key and the generated signatures is the
same in the legitimate signer and in the simulation by $A$. This fact is often
referred to as “the simulated signer perfectly simulates the legitimate
signer”.


Proof (of cases 2 and 3). 


Case 2: Forger is of type 2. We may assume that the $j$ with $e_j \mid e$ and
$\tilde x \ne \tilde x_j$ is fixed. Namely, we may guess the correct $j$, i.e., we iterate over
the polynomially many cases $j$ and assume $j$ fixed in each case (see Chapter
7, proof of Theorem 7.7, for an analogous argument).

To generate the missing elements $g$, $x$ and $\tilde e$ of the public key, $A$ proceeds
as follows:


1. Generate $(l+1)$-bit primes $e_i := \mathrm{GenPrime}(1^{l+1})$, $1 \le i \le t$. Choose a
further prime $\tilde e := \mathrm{GenPrime}(1^{l+1})$. Set

$$g := z^{2\tilde e\prod_{i \ne j} e_i}.$$

2. Randomly choose $w \xleftarrow{u} \mathbb{Z}_n^*$ and $u \xleftarrow{u} \mathbb{Z}_n^*$. Set

$$y_j := w^{2\prod_{i \ne j} e_i} \quad\text{and}\quad \tilde x_j := u^{2\tilde e}.$$

3. Let $x := y_j^{e_j} \cdot g^{-h(\tilde x_j)}$.


Then $g$ and $x$ are uniformly distributed quadratic residues, since $z$, $w$ and
$u$ are uniformly distributed (and the exponents $\tilde e$ and $e_i$ are prime to
$\varphi(n)$, see footnote 10).

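To generate the signature $(e_i, y_i, \tilde y_i)$ for the $i$-th message $m_i$, requested
by Forger, $A$ proceeds as follows:


1. If $i \ne j$, then $A$ randomly chooses $\tilde y_i \xleftarrow{u} \mathrm{QR}_n$ and computes

$$\tilde x_i := \tilde y_i^{\tilde e} \cdot g^{-h(m_i)} \quad\text{and}\quad y_i := \bigl(x \cdot g^{h(\tilde x_i)}\bigr)^{e_i^{-1}}.$$

$A$ can compute the $e_i$-th root, because the $e_i$-th roots of $g$ and $x$ are
known to her by construction.

2. If $i = j$, then the value of $y_j$ has already been computed above. Moreover,
$A$ can compute the correct value of $\tilde y_j = \bigl(\tilde x_j \cdot g^{h(m_j)}\bigr)^{\tilde e^{-1}}$, because she
knows the $\tilde e$-th root of $\tilde x_j$ and $g$. Note that $\tilde y_j$ is uniformly distributed,
as required, since $\tilde x_j$ is uniformly distributed.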
It is obvious from the construction that $A$ generates signatures which satisfy
the verification condition. Forger outputs a forged signature $\sigma = (e, y, \tilde y)$
for a message $m \notin \{m_1, \ldots, m_t\}$. If the forged signature does not pass the
verification procedure, $A$ returns a random exponent and a random element
in $\mathbb{Z}_n^*$, and stops. Otherwise, Forger yields a valid type-2 forgery, such that
$e_j$ divides $e$, i.e., $e = e_j \cdot f$, and $\tilde x = \tilde y^{\tilde e} \cdot g^{-h(m)} \ne \tilde x_j$. We have

$$(y^f)^{e_j} = y^e = x \cdot g^{h(\tilde x)} \quad\text{and}\quad y_j^{e_j} = x \cdot g^{h(\tilde x_j)}.$$

Now $A$ has to compute an $r$-th root of $z$. If $h(\tilde x) \ne h(\tilde x_j)$, which happens
almost with certainty, since $H$ is collision resistant, we get, by dividing the
two equations, the equation

$$\tilde z^{e_j} = g^a = z^{2a\tilde e\prod_{i \ne j} e_i}$$

with $0 < a < 2^l$ (here $a = |h(\tilde x) - h(\tilde x_j)|$, and $\tilde z$ is the corresponding
quotient of $y^f$ and $y_j$). Since all $(l+1)$-bit primes are chosen by
$\mathrm{GenPrime}(1^{l+1})$, the probability that $e_j$ is equal to $\tilde e$ or equal to an $e_i$,
$i \ne j$, is negligibly small. If $e_j$ is different from $\tilde e$ and the $e_i$, $i \ne j$,
then we can compute $z^{e_j^{-1}}$ by Lemma 10.13. In this case, $A$ returns the
exponent $e_j$ and the $e_j$-th root $z^{e_j^{-1}}$.

$A(n, z)$ succeeds in computing the root $z^{e_j^{-1}}$ if the Forger produces a valid
signature, if $h(\tilde x) \ne h(\tilde x_j)$ and if $e_j$ is different from $\tilde e$ and the $e_i$,
$i \ne j$. As in case 1, it follows from the construction that $A$ perfectly
simulates the choice of the key together with the generation of signatures
supplied to Forger. Thus, as we have seen in case 1, the Forger’s probability
of success if he interacts with the simulated signer is the same as if he
interacted with the legitimate signer. Computing the probabilities in a
completely analogous way as in case 1, we derive that

$$\mathrm{prob}\bigl(v^r = z : n \xleftarrow{u} I_{SG,k},\ z \xleftarrow{u} \mathbb{Z}_n^*,\ (v, r) \leftarrow A(n, z)\bigr) \ge \frac{1}{S(k)},$$

for some positive polynomial $S$ and infinitely many $k$. This contradicts the
strong RSA assumption (Lemma 10.11).


Case 3: Forger is of type 3. To complete the public key by $g$, $x$ and $\tilde e$, $A$
proceeds as follows:


1. Generate $(l+1)$-bit primes $\tilde e$ and $e_i$, $1 \le i \le t$, by applying
$\mathrm{GenPrime}(1^{l+1})$. Set

$$g := z^{2\tilde e\prod_i e_i}.$$

2. Choose $a \xleftarrow{u} \{1, \ldots, n^2\}$ and set $x := g^a$.

$A$ can easily generate valid signatures $(e_i, y_i, \tilde y_i)$ for messages $m_i$,
$1 \le i \le t$, requested by Forger. Namely, $A$ chooses $\tilde y_i \xleftarrow{u} \mathrm{QR}_n$ and computes
$\tilde x_i = \tilde y_i^{\tilde e} \cdot g^{-h(m_i)}$ and $y_i = \bigl(x \cdot g^{h(\tilde x_i)}\bigr)^{e_i^{-1}}$. The latter computation
works, because due to the construction, the $e_i$-th roots of $g$ and $x$ can be
immediately derived for $1 \le i \le t$.

Then Forger outputs a forged signature $\sigma = (e, y, \tilde y)$ for a message
$m \notin \{m_1, \ldots, m_t\}$. The signature is of type 3, i.e., $e_i$ does not divide $e$,
$1 \le i \le t$. If the signature is valid, we get the equation

$$y^e = x \cdot g^{h(\tilde x)} = z^f, \quad\text{where } f = 2\tilde e\prod_i e_i \cdot (a + h(\tilde x)).$$

To compute a root of $z$, let $d = \gcd(e, f)$. Observe that $e$ may be a
non-prime, because the verification algorithm only tests whether $e$ is odd.
If $d < e$, i.e., $e$ does not divide $f$, then $r := e/d > 1$, and we can compute
the $r$-th root $z^{r^{-1}}$ by Lemma 10.13.

Thus, $A$ succeeds in computing a root if Forger yields a valid forgery and
if $e$ does not divide $f$. To compute the probability that Forger yields a valid
forgery, we have to consider the distribution of the keys $(g, x)$, generated by
$A$ in step 2.

By the definition of $n$, $\mathrm{QR}_n$ is a cyclic group of order $\tilde p \tilde q$ with distinct
Sophie Germain primes $\tilde p$ and $\tilde q$.

Note that $\tilde e$ and the $e_i$ are $(l+1)$-bit primes, and $l + 1 < k - 1$. Thus,
neither $\tilde e$ nor any of the $e_i$ is equal to $\tilde p$ or $\tilde q$, which are $(k-1)$-bit
primes. Hence, $\tilde e\prod_i e_i$ is prime to $\varphi(n) = 4\tilde p \tilde q$. We conclude that $g$ is
uniformly distributed in $\mathrm{QR}_n$, since $z$ is uniformly chosen from $\mathbb{Z}_n^*$.

Let $a = b\tilde p \tilde q + c$, $0 \le c < \tilde p \tilde q$ (division with remainder). Now
$2^{k-2} < \tilde p, \tilde q < 2^{k-1}$, and $a \xleftarrow{u} \{1, \ldots, n^2\}$ is uniformly chosen. This
implies that the probability of a remainder $c$, $0 \le c < \tilde p \tilde q$, differs from
$1/(\tilde p \tilde q)$ by at most $1/n^2 \approx 1/2^{4k}$. This means that the distribution of $c$ is
polynomially close to the uniform distribution. This in turn implies that the
conditional distribution of $x$, assuming that $g$ is a generator of $\mathrm{QR}_n$, is
polynomially close to the uniform distribution on $\mathrm{QR}_n$. Now
$\mathrm{QR}_n \cong \mathbb{Z}_{\tilde p \tilde q} \cong \mathbb{Z}_{\tilde p} \times \mathbb{Z}_{\tilde q}$ and therefore $(\tilde p - 1)(\tilde q - 1)$ of the $\tilde p \tilde q$
elements in $\mathrm{QR}_n$ are generators. Thus, the probability that $g$ is a generator
of $\mathrm{QR}_n$ is $\ge 1 - 1/2^{k-3}$, which is exponentially close to 1. Summarizing,
we get that the distribution of $x$ is polynomially close to the uniform
distribution on $\mathrm{QR}_n$.
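The count of generators can be confirmed by brute force for a toy modulus. The sketch below assumes the 5-bit Sophie Germain primes 23 and 29 (so $k = 6$ and $n = 47 \cdot 59$); an element of $\mathrm{QR}_n$ is a generator exactly if neither its $\tilde p$-th nor its $\tilde q$-th power equals 1. The variable names are hypothetical.

```python
from math import gcd

PT, QT = 23, 29                   # Sophie Germain primes of k - 1 = 5 bits
P, Q = 2 * PT + 1, 2 * QT + 1     # 47 and 59, both prime
N = P * Q
K = 6

# All quadratic residues modulo N, obtained by squaring every unit.
qr = {a * a % N for a in range(1, N) if gcd(a, N) == 1}

# The order of an element divides PT * QT; it is a generator if the order
# is neither 1, PT nor QT.
generators = [g for g in qr if pow(g, PT, N) != 1 and pow(g, QT, N) != 1]

fraction = len(generators) / len(qr)
bound = 1 - 1 / 2 ** (K - 3)
print(len(qr), len(generators), fraction, bound)
```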

We see that A almost perfectly simulates the legitimate signer. The dis- 
tributions of the keys and the signatures supplied to Forger are polynomially 
close to the distributions when Forger interacts with the legitimate signer. 
By Lemmas B.21 and B.24, we conclude that the probability that Forger 
produces a valid signature, if he interacts with the simulated signer in A, 
cannot be polynomially distinguished from his probability of success when 
interacting with the legitimate signer. Thus, Forger produces in step 3 of A 
a valid signature with probability > 1/Q(k) for some positive polynomial Q 
and infinitely many k. 




We still have to study the conditional probability that $e$ does not divide
$f$, assuming that Forger produces a valid signature. It is sufficient to prove
that this probability is non-negligible. We will show that it is $\ge 1/2$. If
we can prove this estimate assuming $h, n, g, x, \tilde e$ and the forged signed
message $(m, \sigma)$ fixed, for every $h \in H_l$, $n \in I_{SG,k}$, every $g$, $x$ and $\tilde e$
possibly generated by $A$, and every valid $(m, \sigma)$ possibly output by Forger,
then we are done (take the sum over all $h, n, g, x, \tilde e, m$ and $\sigma$). Therefore,
we now assume that $h, n, g, x$ and $\tilde e$, and $m$ and $\sigma$ are fixed. This implies that
$c$ and $\tilde x$ are also fixed.

Let $s$ be a prime dividing $e$. Then $s > 2$ and $s \ne \tilde e$, because otherwise the
verification condition would not be satisfied. Moreover, $s \ne e_i$, since the
forgery is of type 3. Thus, it suffices to prove that $s$ does not divide
$a + h(\tilde x)$ with probability $\ge 1/2$, assuming $h, n, g, x, \tilde e, m$ and $\sigma$ fixed. Let
$a = b\tilde p \tilde q + c$, as above. Then $a + h(\tilde x) = b\tilde p \tilde q + c + h(\tilde x) = L(b)$, with $L$ a
linear function (note that $c$ and $\tilde x$ are fixed). The probability that $s$
divides $a + h(\tilde x)$ is the same as the probability that $L(b) \equiv 0 \bmod s$. Now the
conditional distribution of $b$, assuming $c$ fixed, is also polynomially close
to the uniform distribution on $\{0, \ldots, \lfloor n^2/(\tilde p \tilde q) \rfloor\}$. Thus, the distribution
of $b \bmod s$ is polynomially close to the uniform distribution. $s$ does not
divide $\tilde p \tilde q$, because $|s| \le l + 1 < k - 1$ and $|\tilde p| = |\tilde q| = k - 1$. Thus,
$L(b) \equiv 0 \bmod s$ is a non-vanishing linear equation over $\mathbb{Z}_s$. Hence, the
probability that $L(b) \equiv 0 \bmod s$ is very close to $1/s$. This means that $s$ and
hence $e$ do not divide $f$, with a probability $\ge 1 - 1/(s - 1) \ge 1/2$ (recall
that $s > 2$).

Now the proof of case 3 is finished, and the proof of Theorem 10.12 is 
complete. 


Exercises 


1. Consider the construction of collision-resistant hash functions in Section 
10.2. Explain how the prefix-free encoding of the messages can be avoided 
by applying Merkle’s meta method (Section 3.4.2). 


2. Let $I = (I_k)_{k \in \mathbb{N}}$ be a key set with security parameter $k$, and let
$H = (h_i : \{0,1\}^* \to \{0,1\}^{g(k(i))})_{i \in I}$ be a family of collision-resistant
hash functions. Let $l_i \ge g(k(i)) + k(i)$ for all $i \in I$, and let
$\{0,1\}^{\le l_i} := \{m \in \{0,1\}^* \mid 1 \le |m| \le l_i\}$ be the bit strings of length $\le l_i$.

Show that the family
$$(h_i : \{0,1\}^{\le l_i} \to \{0,1\}^{g(k(i))})_{i \in I}$$
is a family of one-way functions (with respect to $H$’s key generator).


3. The RSA and ElGamal signature schemes are introduced in Chapter 3. 
There, various attacks against the basic schemes (no hash function is 
applied to the messages) are discussed. Classify these attacks and their 
levels of success (according to Section 10.1). 
4. The following signature scheme was suggested by Ong, Schnorr and
Shamir ([OngSchSha84]). Alice chooses two random large distinct primes
$p$ and $q$, and a random $x \in \mathbb{Z}_n^*$, where $n := pq$. Then she computes
$y := -x^{-2} \in \mathbb{Z}_n^*$ and publishes $(n, y)$ as her public key. Her secret is $x$
(she does not need to know the prime factors of $n$). To sign a message
$m \in \mathbb{Z}_n$, she randomly selects an $r \in \mathbb{Z}_n^*$ and calculates (in $\mathbb{Z}_n$)

$$s_1 := 2^{-1}(r^{-1}m + r) \quad\text{and}\quad s_2 := 2^{-1}x(r^{-1}m - r).$$

$(s_1, s_2)$ is the signature of $m$. To verify a signature $(s_1, s_2)$, Bob checks
that $m = s_1^2 + y s_2^2$ (in $\mathbb{Z}_n$):
a. Prove that retrieving the secret key by a key-only attack is equivalent
to the factoring of $n$.
b. The scheme is existentially forgeable by a key-only attack.
c. The probability that a randomly chosen pair $(s_1, s_2)$ is a signature
for a given message $m$ is negligibly small.
d. State the problem, which an adversary has to solve, of forging
signatures for messages of his choice by a key-only attack.
(In fact, Pollard has broken the scheme by an efficient algorithm for
this problem, see [PolSch87].)
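To experiment with the scheme of Exercise 4, the following toy sketch may be useful. It only implements key generation, signing and verification as stated in the exercise and does not address any of the questions a to d; the fixed small modulus and all function names are assumptions made for illustration.

```python
import secrets
from math import gcd

P, Q = 47, 59                   # toy primes; not needed after n is formed
N = P * Q


def random_unit(n: int) -> int:
    """Uniformly random element of Z_n^*."""
    while True:
        a = secrets.randbelow(n - 1) + 1
        if gcd(a, n) == 1:
            return a


def keygen():
    """Return (public y, secret x) with y = -x^(-2) in Z_n^*."""
    x = random_unit(N)
    y = -pow(x, -2, N) % N
    return y, x


def sign(m: int, x: int):
    r = random_unit(N)
    inv2 = pow(2, -1, N)                     # n is odd, so 2 is a unit
    mr = m * pow(r, -1, N) % N               # r^(-1) * m
    s1 = inv2 * (mr + r) % N
    s2 = inv2 * x * (mr - r) % N
    return s1, s2


def verify(m: int, sig, y: int) -> bool:
    s1, s2 = sig
    return m % N == (s1 * s1 + y * s2 * s2) % N


y, x = keygen()
print(verify(1234, sign(1234, x), y))
```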


5. Let $I = (I_k)_{k \in \mathbb{N}}$ be a key set with security parameter $k$, and let
$f_0 = (f_{0,i} : D_i \to D_i)_{i \in I}$, $f_1 = (f_{1,i} : D_i \to D_i)_{i \in I}$ be a claw-free pair
of trapdoor permutations with key generator $K$. We consider the following
signature scheme. As her public key, Alice randomly chooses an
index $i \in I_k$ (by computing $K(1^k)$) and a reference value $x \xleftarrow{u} D_i$.
Her private key is the trapdoor information of $f_{0,i}$ and $f_{1,i}$. We encode
the messages $m \in \{0,1\}^*$ in a prefix-free way (see Section 10.2), and
denote the encoded $m$ by $[m]$. Then Alice’s signature of a message $m$
is $\sigma(i, x, m) := f^{-1}_{[m],i}(x)$, where $f_{[m],i}$ is defined as in Section 10.2. Bob
can verify Alice’s signature $\sigma$ by comparing $f_{[m],i}(\sigma)$ with $x$. Study the
security of the scheme. More precisely, show
a. A claw of $f_0$ and $f_1$ can be computed from $\sigma(i, x, m)$ and $\sigma(i, x, m')$
if the messages $m$ and $m'$ are distinct.
b. The scheme is secure against existential forgery by a key-only attack.
c. Assume that the message space has polynomial cardinality. More
precisely, let $c \in \mathbb{N}$ and assume that only messages $m \in \{0,1\}^{c \lfloor \log_2(k) \rfloor}$
are signed by the scheme. Then the scheme is secure against
adaptively-chosen-message attacks if used as a one-time signature
scheme (i.e., Alice signs at most one message with her key).
d. Which problem do you face in the security proof in c if the scheme
is used to sign arbitrary messages in $\{0,1\}^*$?


6. We consider the same setting as in Exercise 5. We assume that the generation
of the messages $m \in \{0,1\}^*$ to be signed can be uniformly modeled
for all users by a probabilistic polynomial algorithm M(i). In particular,
this means that the messages Alice wants to sign do not depend on her
reference value x.

Let (i, x) be the public key of Alice, and let $m_j$ be the j-th message
to be signed by Alice. The signature $\sigma(i,x,m_j)$ of $m_j$ is defined
as $\sigma(i,x,m_j) := (s_j, [m_1]\|\ldots\|[m_{j-1}])$, with $s_j = f_{[m_j],i}^{-1}(s_{j-1})$. Here
$[m_1]\|\ldots\|[m_{j-1}]$ denotes the concatenation of the prefix-free encoded
messages, and $s_0 := x$ is Alice’s randomly chosen reference value (see
[GolBel01]).

a. Show by an example that in order to prevent forging by known-signature
attacks, the verification procedure has to check whether the
bit string $\hat{m}$ in a signature $(s, \hat{m})$ is well formed with respect to the
prefix-free encoding.
Give the complete verification condition.
b. Prove that no one can existentially forge a signature by a known-signature
attack.


A. Algebra and Number Theory 


Public-key cryptosystems are based on modular arithmetic. In this section, we 
summarize the concepts and results from algebra and number theory which 
are necessary for an understanding of the cryptographic methods. Textbooks 
on number theory and modular arithmetic include [HarWri79], [IreRos82], 
[Rose94], [Forster96] and [Rosen2000]. This section is also intended to es- 
tablish notation. We assume that the reader is familiar with the elementary 
notions of algebra, such as groups, rings and fields. 


A.1 The Integers 


$\mathbb{Z}$ denotes the ring of integers; $\mathbb{N} = \{z \in \mathbb{Z} \mid z > 0\}$ denotes the subset of
natural numbers.

We first introduce the notion of divisors and the fundamental Euclidean 
algorithm which computes the greatest common divisor of two numbers. 


Definition A.1. Let $a, b \in \mathbb{Z}$:

1. a divides b if there is some $c \in \mathbb{Z}$ with b = ac. We write $a \mid b$ for “a
divides b”.
2. $d \in \mathbb{N}$ is called the greatest common divisor of a and b if:
a. $d \mid a$ and $d \mid b$.
b. If $t \in \mathbb{Z}$ divides both a and b, then t divides d.
The greatest common divisor is denoted by gcd(a, b).
3. If gcd(a, b) = 1, then a is called relatively prime to b, or prime to b for
short.


Theorem A.2 (Division with remainder). Let $z, a \in \mathbb{Z}$, $a \ne 0$. Then there
are unique numbers $q, r \in \mathbb{Z}$ such that $z = q \cdot a + r$ and $0 \le r < |a|$.


Proof. In the first step, we prove that such q and r exist. If $a > 0$ and $z \ge 0$,
we may apply induction on z. For $0 \le z < a$ we obviously have $z = 0 \cdot a + z$.
If $z \ge a$, then, by induction, $z - a = q \cdot a + r$ for some q and r, $0 \le r < a$,
and hence $z = (q + 1) \cdot a + r$. If $z < 0$ and $a > 0$, then we have just shown
the existence of an equation $-z = q \cdot a + r$, $0 \le r < a$. Then $z = -q \cdot a$ if
$r = 0$, and $z = -q \cdot a - r = -q \cdot a - a + (a - r) = -(q + 1) \cdot a + (a - r)$ and
$0 < a - r < a$ if $r > 0$. If $a < 0$, then $-a > 0$. Hence $z = q \cdot (-a) + r = -q \cdot a + r$
with $0 \le r < |a|$.

To prove uniqueness, consider $z = q_1 \cdot a + r_1 = q_2 \cdot a + r_2$. Then
$0 = (q_1 - q_2) \cdot a + (r_1 - r_2)$. Hence a divides $(r_1 - r_2)$. Since $|r_1 - r_2| < |a|$, this
implies $r_1 = r_2$, and then also $q_1 = q_2$.


Remark. r is called the remainder of z modulo a. We write z mod a for r.
The number q is the (integer) quotient of z and a. We write z div a for q.


The Euclidean Algorithm. Let $a, b \in \mathbb{Z}$, $a > b > 0$. The greatest common
divisor gcd(a, b) can be computed by an iterated division with remainder. Let
$r_0 := a$, $r_1 := b$ and

$$
\begin{aligned}
r_0 &= q_1 r_1 + r_2, & 0 < r_2 < r_1,\\
r_1 &= q_2 r_2 + r_3, & 0 < r_3 < r_2,\\
&\;\;\vdots\\
r_{k-1} &= q_k r_k + r_{k+1}, & 0 < r_{k+1} < r_k,\\
&\;\;\vdots\\
r_{n-2} &= q_{n-1} r_{n-1} + r_n, & 0 < r_n < r_{n-1},\\
r_{n-1} &= q_n r_n + r_{n+1}, & 0 = r_{n+1}.
\end{aligned}
$$

By construction, $r_1 > r_2 > \ldots$. Therefore, the remainder becomes 0 after
a finite number of steps. The last remainder $\ne 0$ is the greatest common
divisor, as is shown in the next proposition.

Proposition A.3.

1. $r_n = \gcd(a, b)$.

2. There are numbers $d, e \in \mathbb{Z}$ with $\gcd(a, b) = da + eb$.

Proof. 1. From the equations considered in reverse order, we conclude that
$r_n$ divides $r_k$, $k = n-1, n-2, \ldots$. In particular, $r_n$ divides $r_1 = b$ and $r_0 = a$.
Now let t be a divisor of $a = r_0$ and $b = r_1$. Then $t \mid r_k$, $k = 2, 3, \ldots$, and
hence $t \mid r_n$. Thus, $r_n$ is the greatest common divisor.
2. Iteratively substituting $r_{k+1}$ by $r_{k-1} - q_k r_k$, we get

$$
\begin{aligned}
r_n &= r_{n-2} - q_{n-1} \cdot r_{n-1}\\
&= r_{n-2} - q_{n-1} \cdot (r_{n-3} - q_{n-2} \cdot r_{n-2})\\
&= (1 + q_{n-1} q_{n-2}) \cdot r_{n-2} - q_{n-1} \cdot r_{n-3}\\
&\;\;\vdots\\
&= da + eb,
\end{aligned}
$$

with integers d and e.


We have shown that the following algorithm, called Euclid’s algorithm,
outputs the greatest common divisor. abs(a) denotes the absolute value of a.

Algorithm A.4.
int gcd(int a, b)
    while b ≠ 0 do
        r ← a mod b
        a ← b
        b ← r
    return abs(a)
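As a quick illustration, Algorithm A.4 can be transcribed almost line by line into Python. This is a minimal sketch and not part of the original text; the function name `gcd` simply mirrors the pseudocode.

```python
def gcd(a: int, b: int) -> int:
    """Greatest common divisor by iterated division with remainder."""
    while b != 0:
        # r <- a mod b; a <- b; b <- r, written as one tuple assignment
        a, b = b, a % b
    return abs(a)


if __name__ == "__main__":
    # 1071 = 2*462 + 147, 462 = 3*147 + 21, 147 = 7*21 + 0
    print(gcd(1071, 462))  # prints 21
```

The last non-zero remainder in the chain of divisions shown in the comment is 21, which is the value returned.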


We now extend the algorithm, such that not only gcd(a, b) but also the
coefficients d and e of the linear combination gcd(a, b) = da + eb are computed.
For this purpose, we write the recursion

$r_{k-1} = q_k r_k + r_{k+1}$

using matrices

$\begin{pmatrix} r_k \\ r_{k+1} \end{pmatrix} = Q_k \begin{pmatrix} r_{k-1} \\ r_k \end{pmatrix}$, where $Q_k = \begin{pmatrix} 0 & 1 \\ 1 & -q_k \end{pmatrix}$, $k = 1, \ldots, n$.

Multiplying the matrices, we get

$\begin{pmatrix} r_n \\ r_{n+1} \end{pmatrix} = Q_n \cdot Q_{n-1} \cdot \ldots \cdot Q_1 \begin{pmatrix} r_0 \\ r_1 \end{pmatrix}$.

The first component of this equation yields the desired linear combination
for $r_n = \gcd(a, b)$. Therefore, we have to compute $Q_n \cdot Q_{n-1} \cdot \ldots \cdot Q_1$. This is
accomplished by iteratively computing the matrices

$\Lambda_0 := \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$, $\Lambda_k := Q_k \cdot \Lambda_{k-1}$, $k = 1, \ldots, n$,

to finally get $\Lambda_n = Q_n \cdot Q_{n-1} \cdot \ldots \cdot Q_1$. In this way, we have derived the
following algorithm, called the extended Euclidean algorithm. On inputs a
and b it outputs the greatest common divisor and the coefficients d and e of
the linear combination gcd(a, b) = da + eb.

Algorithm A.5.
int array gcdCoef(int a, b)
    λ11 ← 1, λ22 ← 1, λ12 ← 0, λ21 ← 0
    while b ≠ 0 do
        q ← a div b
        r ← a mod b
        a ← b
        b ← r
        t21 ← λ21; t22 ← λ22
        λ21 ← λ11 − q · λ21
        λ22 ← λ12 − q · λ22
        λ11 ← t21
        λ12 ← t22
    return (abs(a), λ11, λ12)
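The matrix update of the extended Euclidean algorithm can be sketched in Python as follows. This is an illustrative transcription that assumes non-negative inputs; the name `gcd_coef` and the variables `l11`, `l12`, `l21`, `l22` (the entries of the accumulated matrix) are chosen here for the example.

```python
def gcd_coef(a: int, b: int) -> tuple[int, int, int]:
    """Return (g, d, e) with g = gcd(a, b) = d*a + e*b, assuming a, b >= 0."""
    # Entries of the accumulated matrix; it starts as the identity.
    l11, l12, l21, l22 = 1, 0, 0, 1
    while b != 0:
        q, r = divmod(a, b)  # q = a div b, r = a mod b
        a, b = b, r
        # Multiply from the left by [[0, 1], [1, -q]]:
        # new first row = old second row, new second row = old first row - q * old second row.
        l11, l12, l21, l22 = l21, l22, l11 - q * l21, l12 - q * l22
    return a, l11, l12


if __name__ == "__main__":
    g, d, e = gcd_coef(240, 46)
    print(g, d * 240 + e * 46)  # both values are gcd(240, 46)
```

Throughout the loop the current value of `a` equals `l11 * a0 + l12 * b0` for the original inputs `a0`, `b0`, which is why the first matrix row holds the desired coefficients at the end.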


We analyze the running time of the Euclidean algorithm. Here we meet 
the Fibonacci numbers. 
Definition A.6. The Fibonacci numbers $f_n$ are recursively defined by
$f_0 = 0$, $f_1 = 1$,
$f_n = f_{n-1} + f_{n-2}$, for $n \ge 2$.


Remark. The Fibonacci numbers can be non-recursively computed using the
formula

$f_n = \frac{1}{\sqrt{5}} (g^n - \tilde{g}^n)$,

where g and $\tilde{g}$ are the solutions of the equation $x^2 = x + 1$:

$g := \frac{1}{2}(1 + \sqrt{5})$ and $\tilde{g} := 1 - g = \frac{1}{2}(1 - \sqrt{5})$.

See, for example, [Forster96].

Definition A.7. g is called the Golden Ratio.¹


Lemma A.8. For $n \ge 2$, $f_n \ge g^{n-2}$. In particular, the Fibonacci numbers
grow exponentially fast.


Proof. The statement is clear for n = 2 (and $f_1 = 1 \ge g^{-1}$ also holds). By induction on n, assuming that
the statement holds for $\le n$, we get

$f_{n+1} = f_n + f_{n-1} \ge g^{n-2} + g^{n-3} = g^{n-3}(1 + g) = g^{n-3} g^2 = g^{n-1}$.


Proposition A.9. Let $a, b \in \mathbb{Z}$, $a > b > 0$. Assume that computing gcd(a, b)
by the Euclidean algorithm takes n iterations (i.e., using n divisions with
remainder). Then $a \ge f_{n+1}$ and $b \ge f_n$.


¹ It is the proportion of length to width which the Greeks found most beautiful.


Proof. Let $r_0 := a$, $r_1 := b$ and consider

$$
\begin{aligned}
r_0 &= q_1 r_1 + r_2, & f_{n+1} &= f_n + f_{n-1},\\
r_1 &= q_2 r_2 + r_3, & f_n &= f_{n-1} + f_{n-2},\\
&\;\;\vdots & &\;\;\vdots\\
r_{n-2} &= q_{n-1} r_{n-1} + r_n, & f_3 &= f_2 + f_1,\\
r_{n-1} &= q_n r_n, & f_2 &= f_1.
\end{aligned}
$$

By induction, starting with i = n and descending, we show that $r_i \ge f_{n+1-i}$.
For i = n, we have $r_n \ge f_1 = 1$. Now assume the inequality proven for all indices $\ge i$.
Then

$r_{i-1} = q_i r_i + r_{i+1} \ge r_i + r_{i+1} \ge f_{n+1-i} + f_{n+1-(i+1)} = f_{n+1-(i-1)}$.

Hence $a = r_0 \ge f_{n+1}$ and $b = r_1 \ge f_n$.


Notation. As is common use, we denote by $\lfloor x \rfloor$ the greatest integer less
than or equal to x (the “floor” of x), and by $\lceil x \rceil$ the smallest integer greater
than or equal to x (the “ceiling” of x).


Corollary A.10. Let $a, b \in \mathbb{Z}$, $a > b > 0$. Then the Euclidean algorithm computes
gcd(a, b) in at most $\lfloor \log_g(a) \rfloor + 1$ iterations.


Proof. Let n be the number of iterations. From $a \ge f_{n+1} \ge g^{n-1}$ (Proposition A.9 and Lemma
A.8) we conclude $n - 1 \le \lfloor \log_g(a) \rfloor$.
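The bound of Corollary A.10 can be checked numerically for small inputs. The following sketch counts the divisions performed by the Euclidean algorithm and compares the count with the bound; the helper names `euclid_iterations` and `iteration_bound` are made up for this illustration, and the logarithm is evaluated in floating point, which is adequate only for moderately sized inputs.

```python
import math

GOLDEN_RATIO = (1 + math.sqrt(5)) / 2


def euclid_iterations(a: int, b: int) -> int:
    """Number of divisions with remainder used by Euclid's algorithm."""
    n = 0
    while b != 0:
        a, b = b, a % b
        n += 1
    return n


def iteration_bound(a: int) -> int:
    """floor(log_g(a)) + 1, the bound from Corollary A.10 (floating point)."""
    return math.floor(math.log(a, GOLDEN_RATIO)) + 1


if __name__ == "__main__":
    # Consecutive Fibonacci numbers are the slowest inputs of their size.
    print(euclid_iterations(13, 8), iteration_bound(13))
```

Consecutive Fibonacci numbers force every quotient except the last to be 1, which is why they appear as the extreme case in Proposition A.9.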


The Binary Encoding of Numbers. Studying algorithms with numbers
as inputs and outputs, we need binary encodings of numbers (and residues,
see below). We always assume that integers $n \ge 0$ are encoded in the standard
way as unsigned integers:

The sequence $z_{k-1} z_{k-2} \ldots z_1 z_0$ of bits $z_i \in \{0,1\}$, $0 \le i \le k-1$, is the
encoding of

$n = z_0 + z_1 \cdot 2^1 + \ldots + z_{k-2} \cdot 2^{k-2} + z_{k-1} \cdot 2^{k-1} = \sum_{i=0}^{k-1} z_i \cdot 2^i$.

If the leading digit $z_{k-1}$ is not zero (i.e., $z_{k-1} = 1$), we call n a k-bit integer,
and k is called the binary length of n. The binary length of $n \in \mathbb{N}$ is usually
denoted by |n|. Of course, we only use this notation if it cannot be confused
with the absolute value. The binary length of $n \in \mathbb{N}$ is $\lfloor \log_2(n) \rfloor + 1$. The
numbers of binary length k are the numbers $n \in \mathbb{N}$ with $2^{k-1} \le n \le 2^k - 1$.


The Big-O Notation. To state estimates, the big-O notation is useful.
Suppose f(k) and g(k) are functions of the positive integers k which take
positive (not necessarily integer) values. We say that f(k) = O(g(k)) if there
is a constant C such that $f(k) \le C \cdot g(k)$ for all sufficiently large k. For
example, $2k^2 + k + 1 = O(k^2)$ because $2k^2 + k + 1 \le 4k^2$ for all $k \ge 1$. In our
examples, the constant C is always “small”, and we use the big-O notation
for convenience. We do not want to state a precise value of C.


Remark. Applying the classical grade school methods, we see that adding and
subtracting two k-bit numbers requires O(k) binary operations. Multiplication
and division with remainder can be done with $O(k^2)$ binary operations
(see [Knuth98] for a more detailed discussion of time estimates for doing
arithmetic). Since the Euclidean algorithm needs O(k) divisions with remainder
for k-bit inputs (Corollary A.10), the greatest common divisor of two k-bit numbers can be
computed by the Euclidean algorithm with $O(k^3)$ binary operations.


Next we will show that every natural number can be uniquely decomposed 
into prime numbers. 


Definition A.11. Let $p \in \mathbb{N}$, $p \ge 2$. p is called a prime (or a prime number)
if 1 and p are the only positive divisors of p. A number $n \in \mathbb{N}$, $n \ge 2$, which is not a
prime is called a composite.


Remark. If p is a prime and $p \mid ab$, $a, b \in \mathbb{Z}$, then either $p \mid a$ or $p \mid b$.


Proof. Assume that p does not divide a and does not divide b. Then there
are $d_1, d_2, e_1, e_2 \in \mathbb{Z}$ with $1 = d_1 p + e_1 a$, $1 = d_2 p + e_2 b$ (Proposition A.3).
Then $1 = d_1 d_2 p^2 + d_1 e_2 b p + e_1 a d_2 p + e_1 e_2 ab$. If p divided ab, then p would
divide 1, which is impossible. Thus, p does not divide ab.


Theorem A.12 (Fundamental Theorem of Arithmetic). Let $n \in \mathbb{N}$, $n \ge 2$.
There are pairwise distinct primes $p_1, \ldots, p_r$ and exponents $e_1, \ldots, e_r \in \mathbb{N}$,
$e_i \ge 1$, $i = 1, \ldots, r$, such that

$n = \prod_{i=1}^{r} p_i^{e_i}$.

The primes $p_1, \ldots, p_r$ and exponents $e_1, \ldots, e_r$ are unique.


Proof. By induction on n we obtain the existence of such a decomposition.
n = 2 is a prime. Now assume that the existence is proven for numbers $\le n$.
Either n + 1 is a prime or $n + 1 = l \cdot m$, with $l, m < n + 1$. By assumption,
there are decompositions of l and m and hence also for n + 1.

In order to prove uniqueness, we assume that there are two different
decompositions of n. Dividing both decompositions by all common primes, we get
(not necessarily distinct) primes $p_1, \ldots, p_s$ and $q_1, \ldots, q_t$ with
$\{p_1, \ldots, p_s\} \cap \{q_1, \ldots, q_t\} = \emptyset$ and $p_1 \cdot \ldots \cdot p_s = q_1 \cdot \ldots \cdot q_t$. Since $p_1 \mid q_1 \cdot \ldots \cdot q_t$, we conclude
from the preceding remark that there is an i, $1 \le i \le t$, with $p_1 \mid q_i$. Since $q_i$ is a prime, this
means $p_1 = q_i$, which is a contradiction.

A.2 Residues


In public-key cryptography, we usually have to compute with remainders
modulo n. This means that the computations take place in the residue class
ring $\mathbb{Z}_n$.


Definition A.13. Let $n \in \mathbb{N}$, $n \ge 2$:


1. $a, b \in \mathbb{Z}$ are congruent modulo n, written as

$a \equiv b \bmod n$,

if n divides a − b. This means that a and b have the same remainder when
divided by n: a mod n = b mod n.

2. Let $a \in \mathbb{Z}$. $[a] := \{x \in \mathbb{Z} \mid x \equiv a \bmod n\}$ is called the residue class of a
modulo n.

3. $\mathbb{Z}_n := \{[a] \mid a \in \mathbb{Z}\}$ is the set of residue classes modulo n.


Remark. As is easily seen, “congruent modulo n” is a symmetric, reflexive
and transitive relation, i.e., it is an equivalence relation. The residue classes
are the equivalence classes. A residue class [a] is completely determined by
one of its members. If $a' \in [a]$, then [a] = [a']. An element $x \in [a]$ is called
a representative of [a]. Division with remainder by n yields the remainders
0, ..., n − 1. Therefore, there are n residue classes in $\mathbb{Z}_n$:

$\mathbb{Z}_n = \{[0], \ldots, [n-1]\}$.

The integers 0, ..., n − 1 are called the natural representatives. The natural
representative of $[x] \in \mathbb{Z}_n$ is just the remainder (x mod n) of x modulo n (see
division with remainder, Theorem A.2). If, in the given context, no confusion
is possible, we sometimes identify the residue classes with their natural
representatives.

Since we will study algorithms whose inputs and outputs are residue
classes, we need binary encodings of the residue classes. The binary encoding
of $[x] \in \mathbb{Z}_n$ is the binary encoding of the natural representative x mod n as
an unsigned integer (see our remark on the binary encoding of non-negative
integers in Section A.1).


Definition A.14. By defining addition and multiplication as

$[a] + [b] = [a + b]$ and $[a] \cdot [b] = [a \cdot b]$,

$\mathbb{Z}_n$ becomes a commutative ring, with unit element [1]. It is called the residue
class ring modulo n.


Remark. The sum [a] + [b] and the product $[a] \cdot [b]$ do not depend on the
choice of the representatives by which they are computed, as straightforward
computations show. For example, let $a' \in [a]$ and $b' \in [b]$. Then $n \mid a' - a$ and
$n \mid b' - b$. Hence $n \mid a' + b' - (a + b)$, and therefore $[a + b] = [a' + b']$.

Doing multiplications in a ring, we are interested in those elements which
have a multiplicative inverse. They are called the units.


Definition A.15. Let R be a commutative ring with unit element e. An
element $x \in R$ is called a unit if there is an element $y \in R$ with $x \cdot y = e$. We
call y a multiplicative inverse of x. The subset of units is denoted by $R^*$.


Remark. The multiplicative inverse of a unit x is uniquely determined, and
we denote it by $x^{-1}$. The set of units $R^*$ is a group with respect to
multiplication.


Example. In $\mathbb{Z}$, elements a and b satisfy $a \cdot b = 1$ if and only if both a and b
are equal to 1, or both are equal to −1. Thus, 1 and −1 are the only units
in $\mathbb{Z}$. The residue class rings $\mathbb{Z}_n$ contain many more units, as the subsequent
considerations show. For example, if p is a prime then every residue class in
$\mathbb{Z}_p$ different from [0] is a unit. An element $[x] \in \mathbb{Z}_n$ in a residue class ring is
a unit if there is a residue class $[y] \in \mathbb{Z}_n$ with $[x] \cdot [y] = [1]$, i.e., n divides
$x \cdot y - 1$.


Proposition A.16. An element $[x] \in \mathbb{Z}_n$ is a unit if and only if gcd(x, n) = 1.
The multiplicative inverse $[x]^{-1}$ of a unit [x] can be computed using the
extended Euclidean algorithm.


Proof. If gcd(x, n) = 1, then there is an equation xb + nc = 1 in $\mathbb{Z}$, and the
coefficients $b, c \in \mathbb{Z}$ can be computed using the extended Euclidean algorithm
A.5. The residue class [b] is an inverse of [x]. Conversely, if [x] is a unit, then
there are $y, k \in \mathbb{Z}$ with $x \cdot y = 1 + k \cdot n$. This implies gcd(x, n) = 1.
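The proof of Proposition A.16 translates directly into a procedure for modular inverses. The sketch below tracks only the coefficient of x in the extended Euclidean algorithm, since the coefficient of n vanishes modulo n; the function name `mod_inverse` is an assumption of this example.

```python
def mod_inverse(x: int, n: int) -> int:
    """Natural representative of [x]^(-1) in Z_n, for n >= 2 and gcd(x, n) = 1."""
    r0, r1 = n, x % n
    c0, c1 = 0, 1  # invariant: r0 = c0 * x and r1 = c1 * x modulo n
    while r1 != 0:
        q = r0 // r1
        r0, r1 = r1, r0 - q * r1
        c0, c1 = c1, c0 - q * c1
    if r0 != 1:
        raise ValueError("x is not a unit modulo n")
    return c0 % n


if __name__ == "__main__":
    inv = mod_inverse(17, 3120)
    print(inv, (17 * inv) % 3120)  # the second value is 1
```

If gcd(x, n) is larger than 1, the function raises an error, in line with the statement that only elements prime to n are units.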


Corollary A.17. Let p be a prime. Then every $[x] \ne [0]$ in $\mathbb{Z}_p$ is a unit.
Thus, $\mathbb{Z}_p$ is a field.


Definition A.18. The subgroup
$\mathbb{Z}_n^* := \{x \in \mathbb{Z}_n \mid x \text{ is a unit in } \mathbb{Z}_n\}$
of units in $\mathbb{Z}_n$ is called the prime residue class group modulo n.


Definition A.19. Let M be a finite set. The number of elements in M is
called the cardinality or order of M. It is denoted by |M|.


We introduce the Euler phi function, which gives the number of units modulo
n.


Definition A.20.

$\varphi : \mathbb{N} \to \mathbb{N},\ n \mapsto |\mathbb{Z}_n^*|$
is called the Euler phi function or the Euler totient function.
Proposition A.21 (Euler).

$\sum_{d \mid n} \varphi(d) = n$.

Proof. If d is a divisor of n, let $Z_d := \{x \mid 1 \le x \le n, \gcd(x, n) = d\}$.
Each $k \in \{1, \ldots, n\}$ belongs to exactly one $Z_d$. Thus $n = \sum_{d \mid n} |Z_d|$. Since
$x \mapsto x/d$ is a bijective map from $Z_d$ to $\mathbb{Z}_{n/d}^*$, we have $|Z_d| = \varphi(n/d)$, and hence

$n = \sum_{d \mid n} \varphi(n/d) = \sum_{d \mid n} \varphi(d)$.


Corollary A.22. Let p be a prime and $k \in \mathbb{N}$. Then $\varphi(p^k) = p^{k-1}(p - 1)$.


Proof. By Euler’s result, $\varphi(1) + \varphi(p) + \ldots + \varphi(p^k) = p^k$ and
$\varphi(1) + \varphi(p) + \ldots + \varphi(p^{k-1}) = p^{k-1}$. Subtracting both equations yields
$\varphi(p^k) = p^k - p^{k-1} = p^{k-1}(p - 1)$.
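Both Proposition A.21 and Corollary A.22 are easy to verify experimentally for small numbers. The following sketch computes the Euler phi function by direct counting, which is practical only for small n; the helper names `phi` and `divisors` are chosen for this illustration.

```python
from math import gcd


def phi(n: int) -> int:
    """Euler phi function by counting the units among 1..n (small n only)."""
    return sum(1 for x in range(1, n + 1) if gcd(x, n) == 1)


def divisors(n: int) -> list[int]:
    """All positive divisors of n, by trial division."""
    return [d for d in range(1, n + 1) if n % d == 0]


if __name__ == "__main__":
    n = 360
    # Proposition A.21: the values phi(d) over all divisors d of n sum to n.
    print(sum(phi(d) for d in divisors(n)) == n)
    # Corollary A.22 with p = 3, k = 4: phi(81) = 27 * 2.
    print(phi(3 ** 4) == 3 ** 3 * 2)
```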

Remarks:


1. By using the Chinese Remainder Theorem below (Section A.3), we will
also get a formula for $\varphi(n)$ if n is not a power of a prime (Corollary A.30).

2. At some points in the book we need a lower bound for the fraction $\varphi(n)/n$
of units in $\mathbb{Z}_n$. In [RosSch62] it is proven that

$\varphi(n) > \frac{n}{e^{\gamma} \log(\log(n)) + \frac{2.6}{\log(\log(n))}}$, with Euler’s constant $\gamma = 0.5772\ldots$.

This inequality implies, for example, that

$\varphi(n) > \frac{n}{6 \log(\log(n))}$ for $n \ge 1.3 \cdot 10^6$,

as a straightforward computation shows.


The RSA cryptosystem is based on old results by Fermat and Euler.²
These results are special cases of the following proposition.


Proposition A.23. Let G be a finite group and e be the unit element of G.
Then $x^{|G|} = e$ for all $x \in G$.


Proof. Since we apply this result only to Abelian groups, we assume in our
proof that the group G is Abelian. A proof for the general case may be found
in most introductory textbooks on algebra.

The map $\mu_x : G \to G$, $g \mapsto xg$, multiplying group elements by x, is a
bijective map (multiplying by $x^{-1}$ is the inverse map). Hence,

$\prod_{g \in G} g = \prod_{g \in G} xg = x^{|G|} \prod_{g \in G} g$,

and this implies $x^{|G|} = e$.


As a first corollary of Proposition A.23, we get Fermat’s Little Theorem.


Proposition A.24 (Fermat). Let p be a prime and $a \in \mathbb{Z}$ be a number that
is prime to p (i.e., p does not divide a). Then

$a^{p-1} \equiv 1 \bmod p$.


² Pierre de Fermat (1601-1665) and Leonhard Euler (1707-1783).

Proof. The residue class [a] of a modulo p is a unit, because a is prime to p
(Proposition A.16). Since $|\mathbb{Z}_p^*| = p - 1$ (Corollary A.17), we have $[a]^{p-1} = [1]$
by Proposition A.23.


Remark. Fermat stated a famous conjecture known as Fermat’s Last Theorem.
It says that the equation $x^n + y^n = z^n$ has no solutions with non-zero
integers x, y and z, for $n \ge 3$. For more than 300 years, Fermat’s conjecture
was one of the outstanding challenges of mathematics. It was finally proven
in 1995 by Andrew Wiles.


Euler generalized Fermat’s Little Theorem. 


Proposition A.25 (Euler). Let $n \in \mathbb{N}$ and let $a \in \mathbb{Z}$ be a number that is
prime to n. Then

$a^{\varphi(n)} \equiv 1 \bmod n$.

Proof. It follows from Proposition A.23, in the same way as Proposition A.24.
The residue class [a] of a modulo n is a unit and $|\mathbb{Z}_n^*| = \varphi(n)$.
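Euler's theorem, and with it Fermat's Little Theorem as the special case of a prime modulus, can be spot-checked for small moduli. This is only a numerical sanity check under the assumption of small n; `phi` is a naive counting helper and `euler_holds` is a name invented for the example.

```python
from math import gcd


def phi(n: int) -> int:
    """Euler phi function by direct counting (fine for small n only)."""
    return sum(1 for x in range(1, n + 1) if gcd(x, n) == 1)


def euler_holds(n: int) -> bool:
    """Check a^phi(n) = 1 (mod n) for every a in 1..n-1 that is prime to n."""
    exponent = phi(n)
    return all(pow(a, exponent, n) == 1 for a in range(1, n) if gcd(a, n) == 1)


if __name__ == "__main__":
    print(all(euler_holds(n) for n in range(2, 100)))
```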


Fast Modular Exponentiation. In cryptography, we often have to compute
a power $x^e$ or a modular power $x^e \bmod n$. This can be done efficiently
by the fast exponentiation algorithm. The idea is that if the exponent e is a
power of 2, say $e = 2^k$, then we can exponentiate by successively squaring:

$x^e = x^{2^k} = ((((\ldots(x^2)^2)^2 \ldots)^2)^2)^2$.

In this way we compute $x^e$ by k squarings. For example, $x^{16} = (((x^2)^2)^2)^2$.
If the exponent is not a power of 2, then we use its binary representation.
Assume that e is a k-bit number, $2^{k-1} \le e < 2^k$. Then

$$
\begin{aligned}
e &= 2^{k-1} e_{k-1} + 2^{k-2} e_{k-2} + \ldots + 2^1 e_1 + 2^0 e_0 \quad (\text{with } e_{k-1} = 1)\\
&= (2^{k-2} e_{k-1} + 2^{k-3} e_{k-2} + \ldots + e_1) \cdot 2 + e_0\\
&= (\ldots((2 e_{k-1} + e_{k-2}) \cdot 2 + e_{k-3}) \cdot 2 + \ldots + e_1) \cdot 2 + e_0.
\end{aligned}
$$

Hence,

$$
\begin{aligned}
x^e &= x^{(\ldots((2 e_{k-1} + e_{k-2}) \cdot 2 + e_{k-3}) \cdot 2 + \ldots + e_1) \cdot 2 + e_0}\\
&= \left(x^{(\ldots((2 e_{k-1} + e_{k-2}) \cdot 2 + e_{k-3}) \cdot 2 + \ldots + e_1)}\right)^2 \cdot x^{e_0}\\
&= (\ldots(((x^2 \cdot x^{e_{k-2}})^2 \cdot x^{e_{k-3}})^2 \cdot \ldots)^2 \cdot x^{e_1})^2 \cdot x^{e_0}.
\end{aligned}
$$

We see that $x^e$ can be computed in k − 1 steps, with each step consisting of
squaring the intermediate result and, if the corresponding binary digit $e_i$ of
e is 1, an additional multiplication by x. If we want to compute the modular
power $x^e \bmod n$, then we take the remainder modulo n after each squaring
and multiplication:

$x^e \bmod n = (\ldots(((x^2 \cdot x^{e_{k-2}} \bmod n)^2 \cdot x^{e_{k-3}} \bmod n)^2 \cdot \ldots)^2 \cdot x^{e_1} \bmod n)^2 \cdot x^{e_0} \bmod n$.


We obtain the following algorithm for fast modular exponentiation.

Algorithm A.26.
int ModPower(int x, e, n)
    y ← x
    for i ← BitLength(e) − 2 downto 0 do
        y ← y² · x^Bit(e,i) mod n
    return y


In particular, we get


Proposition A.27. Let $l = \lfloor \log_2 e \rfloor$. The computation of $x^e \bmod n$ can be
done by l squarings, l multiplications and l divisions.


Proof. The binary length k of e is $\lfloor \log_2(e) \rfloor + 1$, so the loop of Algorithm A.26
is executed k − 1 = l times.
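For illustration, the square-and-multiply loop of Algorithm A.26 can be written in Python as below. This is a sketch assuming $e \ge 1$ and $n \ge 1$; in practice Python's built-in three-argument `pow(x, e, n)` computes the same value, and the name `mod_power` is chosen only for this example.

```python
def mod_power(x: int, e: int, n: int) -> int:
    """x^e mod n by left-to-right square-and-multiply, assuming e >= 1, n >= 1."""
    y = x % n
    # Process the bits of e below the leading one, from high to low.
    for i in range(e.bit_length() - 2, -1, -1):
        y = (y * y) % n  # squaring step
        if (e >> i) & 1:  # Bit(e, i) = 1: one extra multiplication by x
            y = (y * x) % n
    return y


if __name__ == "__main__":
    print(mod_power(5, 117, 19) == pow(5, 117, 19))
```

Reducing modulo n after every squaring and multiplication keeps all intermediate values below $n^2$, which is what makes the method efficient for large exponents.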


A.3 The Chinese Remainder Theorem 


The Chinese Remainder Theorem provides a method of solving systems of 
congruences. The solutions can be found using an easy and efficient algorithm. 


Theorem A.28. Let $n_1, \ldots, n_r \in \mathbb{N}$ be pairwise relatively prime numbers,
i.e., $\gcd(n_i, n_j) = 1$ for $i \ne j$. Let $b_1, b_2, \ldots, b_r$ be arbitrary integers. Then
there is an integer b such that

$b \equiv b_i \bmod n_i$, $i = 1, \ldots, r$.

Furthermore, the remainder b mod n is unique, where $n = n_1 \cdot \ldots \cdot n_r$.


The statement means that there is a one-to-one correspondence between
the residue classes modulo n and tuples of residue classes modulo $n_1, \ldots, n_r$.
This one-to-one correspondence preserves the additive and multiplicative
structure. Therefore, we have the following ring-theoretic formulation of
Theorem A.28.


Theorem A.29 (Chinese Remainder Theorem). Let $n_1, \ldots, n_r \in \mathbb{N}$ be pairwise
relatively prime numbers, i.e., $\gcd(n_i, n_j) = 1$ for $i \ne j$. Let
$n = n_1 \cdot \ldots \cdot n_r$. Then the map

$\psi : \mathbb{Z}_n \to \mathbb{Z}_{n_1} \times \ldots \times \mathbb{Z}_{n_r}$, $[x] \mapsto ([x \bmod n_1], \ldots, [x \bmod n_r])$

is an isomorphism of rings.


Remark. Before we give a proof, we review the notion of an “isomorphism”.
It means that $\psi$ is a homomorphism and bijective. “Homomorphism” means
that $\psi$ preserves the additive and multiplicative structure. More precisely, a
map $f : R \to R'$ between rings with unit elements e and e' is called a (ring)
homomorphism if

$f(e) = e'$ and $f(a + b) = f(a) + f(b)$, $f(a \cdot b) = f(a) \cdot f(b)$ for all $a, b \in R$.

If f is a bijective homomorphism, then, automatically, the inverse map
$g = f^{-1}$ is also a homomorphism. Namely, let $a', b' \in R'$. Then $a' = f(a)$ and
$b' = f(b)$ for some $a, b \in R$, and $g(a' \cdot b') = g(f(a) \cdot f(b)) = g(f(a \cdot b)) = a \cdot b = g(a') \cdot g(b')$
(analogously for + instead of ·).

Being an isomorphism, as $\psi$ is, is an extremely nice feature. It means, in
particular, that a is a unit in R if and only if f(a) is a unit in R' (to see this,
compute $e' = f(e) = f(a \cdot a^{-1}) = f(a) \cdot f(a^{-1})$, hence $f(a^{-1})$ is an inverse
of f(a)). And the “same” equations hold in domain and range. For example,
we have $a^2 = b$ in R if and only if $f(a)^2 = f(b)$ (note that $f(a)^2 = f(a^2)$).
Thus, b is a square if and only if f(b) is a square (we will use this example
in Section A.7).

Isomorphism means that the domain and range may be considered to be
the same for all questions concerning addition and multiplication.


Proof (of Theorem A.29). Since each $n_i$ divides n, the map is well defined,
and it obviously is a ring homomorphism. The domain and range of the map
have the same cardinality (i.e., they contain the same number of elements).
Thus, it suffices to prove that $\psi$ is surjective.

Let $t_i := n/n_i = \prod_{k \ne i} n_k$. Then $t_i \equiv 0 \bmod n_k$ for all $k \ne i$, and $\gcd(t_i, n_i) = 1$.
Hence, there is a $d_i \in \mathbb{Z}$ with $d_i \cdot t_i \equiv 1 \bmod n_i$ (Proposition A.16). Setting
$u_i := d_i \cdot t_i$, we have

$u_i \equiv 0 \bmod n_k$ for all $k \ne i$, and $u_i \equiv 1 \bmod n_i$.

This means that the element (0, ..., 0, 1, 0, ..., 0) (the i-th component is 1,
all other components are 0) is in the image of $\psi$. If $([x_1], \ldots, [x_r]) \in \mathbb{Z}_{n_1} \times \ldots \times \mathbb{Z}_{n_r}$
is an arbitrary element, then $\psi(\sum_{i=1}^{r} x_i \cdot u_i) = ([x_1], \ldots, [x_r])$.
Remarks:


1. Actually, the proof describes an efficient algorithm for computing a number
b with $b \equiv b_i \bmod n_i$, $i = 1, \ldots, r$ (recall our first formulation of
the Chinese Remainder Theorem in Theorem A.28). In a preprocessing
step, the inverse elements $[d_i] = [t_i]^{-1}$ are computed modulo $n_i$ using the
extended Euclidean algorithm (Proposition A.16). Then b can be computed
as $b = \sum_{i=1}^{r} b_i \cdot d_i \cdot t_i$, for any given integers $b_i$, $1 \le i \le r$.
We mainly apply the Chinese Remainder Theorem with r = 2 (for example,
in the RSA cryptosystem). Here we simply compute coefficients
d and e with $1 = d \cdot n_1 + e \cdot n_2$ (using the extended Euclidean algorithm
A.5), and then $b = d \cdot n_1 \cdot b_2 + e \cdot n_2 \cdot b_1$.

2. The Chinese Remainder Theorem can be used to make arithmetic computations
modulo n easier and (much) more efficient. We map the operands
to $\mathbb{Z}_{n_1} \times \ldots \times \mathbb{Z}_{n_r}$ by $\psi$ and do our computation there. $\mathbb{Z}_{n_1} \times \ldots \times \mathbb{Z}_{n_r}$ is
a direct product of rings. Addition and multiplication are done componentwise,
i.e., we perform the computation modulo $n_i$, for $i = 1, \ldots, r$.
Here we work with (much) smaller numbers.³ Finally, we map the result
back to $\mathbb{Z}_n$ by $\psi^{-1}$ (which is easily done, as we have seen in the preceding
remark).
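The reconstruction of b from its residues, as described in the first remark, can be sketched in Python. The example assumes Python 3.8 or later, where `pow(t, -1, m)` returns a modular inverse and `math.prod` is available; the function name `crt` is an assumption of this illustration.

```python
from math import prod


def crt(residues: list[int], moduli: list[int]) -> int:
    """Return b in 0..n-1 with b = residues[i] (mod moduli[i]) for all i.

    Assumes the moduli are pairwise relatively prime and each is >= 2.
    """
    n = prod(moduli)
    b = 0
    for b_i, n_i in zip(residues, moduli):
        t_i = n // n_i           # product of all other moduli
        d_i = pow(t_i, -1, n_i)  # inverse of t_i modulo n_i
        b += b_i * d_i * t_i     # u_i = d_i * t_i is 1 mod n_i and 0 mod the others
    return b % n


if __name__ == "__main__":
    b = crt([2, 3, 2], [3, 5, 7])
    print(b, b % 3, b % 5, b % 7)
```

The inverses $d_i$ depend only on the moduli, so in a setting with fixed moduli (such as RSA with fixed prime factors) they can be computed once and reused.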


As a corollary of the Chinese Remainder Theorem, we get a formula for
Euler’s phi function for composite inputs.
Corollary A.30. Let $n \in \mathbb{N}$ and $n = p_1^{e_1} \cdot \ldots \cdot p_r^{e_r}$ be the decomposition of n
into primes (as stated in Theorem A.12). Then:


1. $\mathbb{Z}_n$ is isomorphic to $\mathbb{Z}_{p_1^{e_1}} \times \ldots \times \mathbb{Z}_{p_r^{e_r}}$.
2. $\mathbb{Z}_n^*$ is isomorphic to $\mathbb{Z}_{p_1^{e_1}}^* \times \ldots \times \mathbb{Z}_{p_r^{e_r}}^*$.
In particular, we have for Euler’s phi function that

$\varphi(n) = n \prod_{i=1}^{r} \left(1 - \frac{1}{p_i}\right)$.


Proof. The ring isomorphism of Theorem A.29 induces, in particular, an
isomorphism on the units. Hence,

$\varphi(n) = \varphi(p_1^{e_1}) \cdot \ldots \cdot \varphi(p_r^{e_r})$,

and the formula follows from Corollary A.22.

Definition A.31. Let G be a finite group and let e be the unit element of
G. Let $x \in G$. The smallest $n \in \mathbb{N}$ with $x^n = e$ is called the order of x. We
write this as ord(x).


Remark. There are exponents $n \in \mathbb{N}$ with $x^n = e$. Namely, since G is finite,
there are exponents m and m', m < m', with $x^m = x^{m'}$. Then m' − m > 0
and $x^{m' - m} = e$.


Lemma A.32. Let G be a finite group and $x \in G$. Let $n \in \mathbb{N}$ with $x^n = e$.
Then ord(x) divides n.


Proof. Let $n = q \cdot \mathrm{ord}(x) + r$, $0 \le r < \mathrm{ord}(x)$ (division with remainder). Then
$x^r = e$. Since $0 \le r < \mathrm{ord}(x)$, this implies r = 0.


Corollary A.33. Let G be a finite group and $x \in G$. Then ord(x) divides
the order |G| of G.


Proof. By Proposition A.23, $x^{|G|} = e$.


Lemma A.34. Let G be a finite group and $x \in G$. Let $l \in \mathbb{Z}$ and
$d = \gcd(l, \mathrm{ord}(x))$. Then $\mathrm{ord}(x^l) = \mathrm{ord}(x)/d$.


³ For example, if n = pq (as in an RSA scheme) with 512-bit numbers p and q,
then we compute with 512-bit numbers instead of with 1024-bit numbers.

Proof. Let $r = \mathrm{ord}(x^l)$. From $(x^l)^{\mathrm{ord}(x)/d} = (x^{\mathrm{ord}(x)})^{l/d} = e$ we conclude
$r \le \mathrm{ord}(x)/d$. Choose numbers a and b with $d = a \cdot l + b \cdot \mathrm{ord}(x)$ (Proposition
A.3). From $x^{r \cdot d} = x^{r \cdot a \cdot l + r \cdot b \cdot \mathrm{ord}(x)} = (x^l)^{r \cdot a} = e$ we derive $\mathrm{ord}(x) \le r \cdot d$.
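Lemma A.34 can be observed directly in a small group such as the units modulo 31. The following sketch determines orders by brute force, which is feasible only for small moduli; the function name `order` is chosen for this illustration.

```python
from math import gcd


def order(x: int, n: int) -> int:
    """Order of [x] in the unit group modulo n, assuming n >= 2 and gcd(x, n) = 1."""
    k, y = 1, x % n
    while y != 1:
        y = (y * x) % n
        k += 1
    return k


if __name__ == "__main__":
    p, x, l = 31, 3, 12
    # ord(x^l) should equal ord(x) / gcd(l, ord(x)).
    print(order(pow(x, l, p), p) == order(x, p) // gcd(l, order(x, p)))
```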


Definition A.35. Let G be a finite group. G is called cyclic if there is an
$x \in G$ which generates G, i.e., $G = \{x, x^2, x^3, \ldots, x^{\mathrm{ord}(x)-1}, x^{\mathrm{ord}(x)} = e\}$.
Such an element x is called a generator of G.


Theorem A.36. Let p be a prime. Then $\mathbb{Z}_p^*$ is cyclic, and the number of
generators is $\varphi(p - 1)$.


Proof. For $1 \le d \le p - 1$, let $S_d = \{x \in \mathbb{Z}_p^* \mid \mathrm{ord}(x) = d\}$ be the units
of order d. If $S_d \ne \emptyset$, let $a \in S_d$. The equation $X^d - 1 = 0$ has at most d
solutions in $\mathbb{Z}_p$, since $\mathbb{Z}_p$ is a field (Corollary A.17). Hence, the solutions
of $X^d - 1 = 0$ are just the elements of $A := \{a, a^2, \ldots, a^d\}$. Each $x \in S_d$ is a
solution of $X^d - 1 = 0$, and therefore $S_d \subseteq A$. Using Lemma A.34 we derive
that $S_d = \{a^c \mid 1 \le c < d, \gcd(c, d) = 1\}$. In particular, we conclude that
$|S_d| = \varphi(d)$ if $S_d \ne \emptyset$ (and an $a \in S_d$ exists).

By Fermat’s Little Theorem (Proposition A.24), $\mathbb{Z}_p^*$ is the disjoint union of
the sets $S_d$, $d \mid p - 1$. Hence $|\mathbb{Z}_p^*| = p - 1 = \sum_{d \mid p-1} |S_d|$. On the other hand,
$p - 1 = \sum_{d \mid p-1} \varphi(d)$ (Proposition A.21), and we see that $|S_d| = \varphi(d)$ must
hold for all divisors d of p − 1. In particular, $|S_{p-1}| = \varphi(p - 1)$. This means
that there are $\varphi(p - 1)$ generators of $\mathbb{Z}_p^*$.


Definition A.37. Let p be a prime. A generator g of the cyclic group $\mathbb{Z}_p^*$ is
called a primitive root of $\mathbb{Z}_p^*$, or a primitive root modulo p.


Remark. It can be proven that $\mathbb{Z}_n^*$ is cyclic if and only if n is one of the
following numbers: 1, 2, 4, $p^k$ or $2p^k$; p a prime, $p \ge 3$, $k \ge 1$.
Proposition A.38. Let p be a prime. Then $x \in \mathbb{Z}_p^*$ is a primitive root if and
only if $x^{(p-1)/q} \ne [1]$ for every prime q which divides p − 1.


Proof. An element x is a primitive root if and only if x has order p − 1. Since
ord(x) divides p − 1 (Corollary A.33), either $x^{(p-1)/q} = [1]$ for some prime
divisor q of p − 1 or ord(x) = p − 1.


We may use Proposition A.38 to generate a primitive root for those primes 
p for which we know (or can efficiently compute) the prime factors of p — 1. 


Algorithm A.39.
int PrimitiveRoot(prime p)
1 Randomly choose an integer g, with 0 < g < p − 1
2 if $g^{(p-1) \,\mathrm{div}\, q} \not\equiv 1 \bmod p$, for all primes q dividing p − 1
3 then return g
4 else go to 1

Since $\varphi(p - 1) > (p - 1)/(6 \log(\log(p - 1)))$ (see Section A.2), we expect to find
a primitive element after O(log(log(p))) iterations (see Lemma B.12).
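A randomized search in the spirit of Algorithm A.39 might look as follows in Python. The sketch assumes that p is an odd prime and that the caller supplies the distinct prime factors of p − 1; the function name and the parameter `prime_factors` are illustrative assumptions, and candidates are drawn from 1, ..., p − 1 here.

```python
import random


def primitive_root(p: int, prime_factors: list[int]) -> int:
    """Return a primitive root modulo the odd prime p.

    prime_factors must contain every prime q dividing p - 1 (an assumption
    the caller has to guarantee; the factors are not checked here).
    """
    while True:
        g = random.randrange(1, p)
        # Proposition A.38: g is a primitive root iff g^((p-1)/q) != 1 for all q.
        if all(pow(g, (p - 1) // q, p) != 1 for q in prime_factors):
            return g


if __name__ == "__main__":
    p = 23  # p - 1 = 22 = 2 * 11
    g = primitive_root(p, [2, 11])
    print(g, len({pow(g, k, p) for k in range(p - 1)}) == p - 1)
```

The expected number of loop iterations is $(p-1)/\varphi(p-1)$, which matches the O(log(log(p))) estimate stated above.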


No efficient algorithm is known for the computation of primitive roots for
arbitrary primes. The problem is to compute the prime factors of p − 1, which
we need in Algorithm A.39. Often there are primitive roots which are small.

Algorithm A.39 is used, for example, in the key-generation procedure of
the ElGamal cryptosystem (see Section 3.5.1). There the primes p are chosen
in such a way that the prime factors of p − 1 can be derived efficiently.


Lemma A.40. Let p be a prime and let q be a prime that divides p − 1. Then
the set
$G_q := \{x \in \mathbb{Z}_p^* \mid \mathrm{ord}(x) = q \text{ or } x = [1]\}$,
which consists of the unit element [1] and the elements of order q, is a subgroup
of $\mathbb{Z}_p^*$. $G_q$ is a cyclic group, and every element $x \in \mathbb{Z}_p^*$ of order q, i.e.,
every element $x \in G_q$, $x \ne [1]$, is a generator. $G_q$ is generated, for example,
by $g^{(p-1)/q}$, where g is a primitive root modulo p. $G_q$ is the only subgroup of
$\mathbb{Z}_p^*$ of order q.


Proof. Let $x, y \in G_q$. Then $(xy)^q = x^q y^q = [1]$, and therefore ord(xy) divides
q. Since q is a prime, we conclude that ord(xy) is 1 or q. Thus $xy \in G_q$, and
$G_q$ is a subgroup of $\mathbb{Z}_p^*$. Let $h \in \mathbb{Z}_p^*$ be an element of order q, for example,
$h := g^{(p-1)/q}$, where g is a primitive root modulo p. Then $\{h^0, h^1, h^2, \ldots, h^{q-1}\} \subseteq G_q$.
The elements of $G_q$ are solutions of the equation $X^q - 1 = 0$ in $\mathbb{Z}_p$. This equation
has at most q solutions in $\mathbb{Z}_p$, since $\mathbb{Z}_p$ is a field (Corollary A.17). Therefore
$\{h^0, h^1, h^2, \ldots, h^{q-1}\} = G_q$, and h is a generator of $G_q$. If H is any subgroup
of order q and $z \in H$, $z \ne [1]$, then ord(z) divides q, and hence ord(z) = q,
because q is a prime. Thus $z \in G_q$, and we conclude that $H = G_q$.


Computing Modulo a Prime. The security of many cryptographic schemes
is based on the discrete logarithm assumption, which says that $x \mapsto g^x \bmod p$
is a one-way function. Here p is a large prime and the base element g is


1. either a primitive root modulo p, i.e., a generator of $\mathbb{Z}_p^*$, or
2. it is an element of order q in $\mathbb{Z}_p^*$, i.e., a generator of the subgroup $G_q$ of
order q, and q is a (large) prime that divides p − 1.


Examples of such schemes which we discuss in this book are ElGamal’s en- 
cryption and digital signatures, the digital signature standard DSS (see Sec- 
tion 3.5), commitment schemes (see Section 4.3.2), electronic elections (see 
Section 4.4) and digital cash (see Section 4.5). 

When setting up such schemes, generators $g$ of $\mathbb{Z}_p^*$ or $G_q$ have
to be selected. This can be difficult or even infeasible in the first case,
because we must know the prime factors of $p-1$ in order to test whether a given
element $g$ is a primitive root (see Algorithm A.39 above). On the other hand,
it is easy to find a generator $g$ of $G_q$. We simply take a random element
$h \in \mathbb{Z}_p^*$ and set $g := h^{(p-1)/q}$. The order of $g$ divides $q$,
because $g^q = h^{p-1} = [1]$. Since $q$ is a prime, we conclude that
$\mathrm{ord}(g) = 1$ or $\mathrm{ord}(g) = q$. Therefore, if $g \neq [1]$, then
$\mathrm{ord}(g) = q$ and $g$ is a generator of $G_q$.
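
The selection of a generator of $G_q$ described above can be sketched in a few
lines of Python. This is only an illustration under stated assumptions: the
function name `subgroup_generator` and the toy parameters $p = 23$, $q = 11$
are ours, and the primality of $p$ and $q$ is assumed rather than checked.

```python
import random

def subgroup_generator(p, q, rng=random):
    """Return a generator of the order-q subgroup G_q of Z_p^*.

    Assumes (without checking) that p and q are primes and q divides p - 1.
    """
    assert (p - 1) % q == 0
    while True:
        h = rng.randrange(1, p)        # random element of Z_p^*
        g = pow(h, (p - 1) // q, p)    # g = h^((p-1)/q) mod p
        if g != 1:                     # ord(g) is 1 or q; reject the unit element
            return g

p, q = 23, 11                          # toy parameters: 11 divides 22
g = subgroup_generator(p, q)
```

For realistic parameter sizes the loop almost always terminates in the first
iteration, because only $(p-1)/q$ of the $p-1$ possible values of $h$ are
mapped to $[1]$.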
To implement cryptographic operations, we have to compute in $\mathbb{Z}_p^*$
or in the subgroup $G_q$. The following rules simplify these computations.

1. Let $x \in \mathbb{Z}_p^*$. Then $x^k = x^{k'}$, if $k \equiv k' \bmod (p-1)$.
   In particular, $x^k = x^{k \bmod (p-1)}$, i.e., exponents can be reduced
   modulo $(p-1)$, and $x^{-k} = x^{p-1-k}$.
2. Let $x \in \mathbb{Z}_p^*$ be an element of order $q$, i.e., $x \in G_q$. Then
   $x^k = x^{k'}$, if $k \equiv k' \bmod q$.
   In particular, $x^k = x^{k \bmod q}$, i.e., exponents can be reduced modulo
   $q$, and $x^{-k} = x^{q-k}$.

The rules state that the exponents are added and multiplied modulo $(p-1)$
or modulo $q$. The rules hold, because $x^{p-1} = [1]$ for
$x \in \mathbb{Z}_p^*$ (Proposition A.24) and $x^q = [1]$ for $x \in G_q$, which
implies that
$$x^{k + l(p-1)} = x^k x^{l(p-1)} = x^k (x^{p-1})^l = x^k [1]^l = x^k \quad \text{for } x \in \mathbb{Z}_p^*$$
and
$$x^{k + lq} = x^k x^{lq} = x^k (x^q)^l = x^k [1]^l = x^k \quad \text{for } x \in G_q.$$
These rules can be very useful in computations. For example, let
$x \in \mathbb{Z}_p^*$ and $k \in \{0, 1, \ldots, p-2\}$. Then you can compute
the inverse $x^{-k}$ of $x^k$ by raising $x$ to the $(p-1-k)$-th power,
$x^{-k} = x^{p-1-k}$, without explicitly computing an inverse by using, for
example, the Euclidean algorithm. Note that $(p-1-k)$ is a positive exponent.
Powers of $x$ are efficiently computed by the fast exponentiation algorithm
(Algorithm A.26).
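
As a small numerical check of these rules, the following Python sketch computes
an inverse by exponentiation only. The values $p = 101$, $x = 7$ and $k = 30$
are arbitrary toy choices of ours; Python's built-in three-argument `pow`
plays the role of the fast exponentiation algorithm.

```python
p = 101                        # a small prime, chosen only for illustration
x, k = 7, 30                   # x in Z_p^*, 0 <= k <= p - 2

xk = pow(x, k, p)              # x^k mod p
inv = pow(x, p - 1 - k, p)     # x^(-k) = x^(p-1-k) mod p, no Euclid needed

# Exponents may be reduced modulo p - 1 before exponentiating.
big = k + 5 * (p - 1)
same = pow(x, big, p) == pow(x, big % (p - 1), p)
```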
In many cases it is also possible to compute the $k$-th root of elements in
$\mathbb{Z}_p^*$.

1. Let $x \in \mathbb{Z}_p^*$ and $k \in \mathbb{N}$ with $\gcd(k, p-1) = 1$,
   i.e., $k$ is a unit modulo $p-1$. Let $k^{-1}$ be the inverse of $k$ modulo
   $p-1$, i.e., $k \cdot k^{-1} \equiv 1 \bmod (p-1)$.
   Then $(x^{k^{-1}})^k = x$, i.e., $x^{k^{-1}}$ is a $k$-th root of $x$ in
   $\mathbb{Z}_p^*$.
2. Let $x \in \mathbb{Z}_p^*$ be an element of order $q$, i.e., $x \in G_q$, and
   $k \in \mathbb{N}$ with $1 \le k < q$. Let $k^{-1}$ be the inverse of $k$
   modulo $q$, i.e., $k \cdot k^{-1} \equiv 1 \bmod q$.
   Then $(x^{k^{-1}})^k = x$, i.e., $x^{k^{-1}}$ is a $k$-th root of $x$ in
   $\mathbb{Z}_p^*$.

It is common practice to denote the $k$-th root $x^{k^{-1}}$ by $x^{1/k}$.
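
Both root rules can be tried out directly. The sketch below uses toy values of
our own choosing ($p = 101$ for the first rule; $p = 23$, $q = 11$ and the
element $2$ of order 11 for the second) and relies on Python 3.8+, where
`pow(k, -1, m)` returns the inverse of `k` modulo `m`.

```python
# Rule 1: k is a unit modulo p - 1.
p = 101
k = 3                              # gcd(3, 100) = 1
x = 42
k_inv = pow(k, -1, p - 1)          # inverse of k modulo p - 1
root = pow(x, k_inv, p)            # x^(1/k) = x^(k^-1) mod p

# Rule 2: x has prime order q, and 1 <= k < q.
p2, q2 = 23, 11
x2 = 2                             # 2^11 = 2048 = 1 mod 23, so ord(2) = 11
k2 = 4
root2 = pow(x2, pow(k2, -1, q2), p2)
```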
You can apply analogous rules of computation to elements $g^k$ in any finite
group $G$. Proposition A.23, which says that $g^{|G|}$ is the unit element,
implies that exponents $k$ are added and multiplied modulo the order $|G|$ of
$G$.


A.5 Polynomials and Finite Fields 


A finite field is a field with a finite number of elements. In Section A.2, we
met examples of finite fields: The residue class ring $\mathbb{Z}_n$ is a
field, if and only if $n$ is a prime. The fields $\mathbb{Z}_p$, $p$ a prime
number, are called the finite prime fields, and they are also denoted by
$\mathbb{F}_p$. Finite fields are extensions of these prime fields. Field
extensions are constructed by using polynomials. So we first study the ring of
polynomials with coefficients in a field $k$.


A.5.1 The Ring of Polynomials 


Let $k[X]$ be the ring of polynomials in one variable $X$ over a (not
necessarily finite) field $k$. The elements of $k[X]$ are the polynomials
$$F(X) = a_0 + a_1 X + a_2 X^2 + \ldots + a_d X^d = \sum_{i=0}^{d} a_i X^i,$$
with coefficients $a_i \in k$, $0 \le i \le d$.

If we assume that $a_d \neq 0$, then the leading term $a_d X^d$ really appears
in the polynomial, and we call $d$ the degree of $F$, $\deg(F)$ for short. The
polynomials of degree 0 are just the non-zero elements of $k$.

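The polynomials in $k[X]$ are added and multiplied as usual:

1. We add two polynomials $F = \sum_{i=0}^{d} a_i X^i$ and
   $G = \sum_{i=0}^{e} b_i X^i$, assume $d \le e$, by adding the coefficients
   (set $a_i = 0$ for $d < i \le e$):
   $$F + G = \sum_{i=0}^{e} (a_i + b_i) X^i.$$
2. The product of two polynomials $F = \sum_{i=0}^{d} a_i X^i$ and
   $G = \sum_{i=0}^{e} b_i X^i$ is
   $$F \cdot G = \sum_{k=0}^{d+e} \left( \sum_{i=0}^{k} a_i b_{k-i} \right) X^k,$$
   where we set $a_i = 0$ for $i > d$ and $b_i = 0$ for $i > e$.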

With this addition and multiplication, $k[X]$ becomes a commutative ring with
unit element. The unit element of $k[X]$ is the unit element 1 of $k$. The ring
$k[X]$ has no zero divisors, i.e., if $F$ and $G$ are non-zero polynomials, then
the product $F \cdot G$ is also non-zero.

The algebraic properties of the ring $k[X]$ of polynomials are analogous to
the algebraic properties of the ring of integers.

Analogously to Definition A.1, we define for polynomials $F$ and $G$ what
it means that $F$ divides $G$ and the greatest common divisor of $F$ and $G$.
The greatest common divisor is unique up to a factor $c \in k$, $c \neq 0$,
i.e., if $A$ is a greatest common divisor of $F$ and $G$, then $c \cdot A$ is
also a greatest common divisor, for $c \in k^* = k \setminus \{0\}$.

A polynomial $F$ is (relatively) prime to $G$ if the only common divisors of
$F$ and $G$ are the units $k^*$ of $k$.

Division with remainder works as with the integers. The difference is that
the "size" of a polynomial is measured by using the degree, whereas the
absolute value was used for an integer.

Theorem A.41 (Division with remainder). Let $F, G \in k[X]$, $G \neq 0$. Then
there are unique polynomials $Q, R \in k[X]$, such that $F = Q \cdot G + R$ and
$0 \le \deg(R) < \deg(G)$.


Proof. The proof runs exactly in the same way as the proof of Theorem A.2: 
Replace the absolute value with the degree. 


R is called the remainder of F modulo G. We write F mod G for R. The 
polynomial Q is the quotient of F and G. We write F div G for Q. 
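
For binary polynomials, division with remainder is particularly simple. The
following sketch is our own illustration, not part of the theorem: it encodes a
polynomial over $\mathbb{F}_2$ as a Python integer whose bit $i$ is the
coefficient of $X^i$ (an encoding we assume here), so that subtraction becomes
XOR. The name `poly_divmod_gf2` is hypothetical.

```python
def poly_divmod_gf2(f, g):
    """Divide binary polynomials: return (q, r) with f = q*g + r over F_2.

    Polynomials are encoded as ints; bit i is the coefficient of X^i.
    """
    if g == 0:
        raise ZeroDivisionError("division by the zero polynomial")
    q = 0
    deg_g = g.bit_length() - 1
    while f.bit_length() - 1 >= deg_g:
        shift = (f.bit_length() - 1) - deg_g
        q ^= 1 << shift            # add X^shift to the quotient
        f ^= g << shift            # subtract (= XOR) X^shift * g
    return q, f

# X^11 + X^7 + X^6 + X^4 + X^3 + 1 divided by X^8 + X^4 + X^3 + X + 1
quotient, remainder = poly_divmod_gf2(0x8D9, 0x11B)
```

Each pass through the loop cancels the leading term of the current dividend,
so the degree strictly decreases, exactly as the absolute value does in the
integer case.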

You can compute a greatest common divisor of polynomials $F$ and $G$ by
using the Euclidean algorithm, and the extended Euclidean algorithm yields
the coefficients $C, D \in k[X]$ of a linear combination
$$A = C \cdot F + D \cdot G,$$
with $A$ a greatest common divisor of $F$ and $G$.

If you have obtained such a linear combination for one greatest common 
divisor, then you immediately get a linear combination for any other greatest 
common divisor by multiplying with a unit from k*. 

In particular, if $F$ is prime to $G$, then the extended Euclidean algorithm
computes a linear combination
$$1 = C \cdot F + D \cdot G.$$

We also have the analogue of prime numbers.


Definition A.42. Let $P \in k[X]$, $P \notin k$. $P$ is called irreducible (or
a prime) if the only divisors of $P$ are the elements $c \in k^*$ and
$c \cdot P$, $c \in k^*$, or, equivalently, if whenever one can write
$P = F \cdot G$ with $F, G \in k[X]$, then $F \in k^*$ or $G \in k^*$. A
polynomial $Q \in k[X]$ which is not irreducible is called reducible or a
composite.


Like the ring $\mathbb{Z}$ of integers, the ring $k[X]$ of polynomials is
factorial, i.e., every element has a unique decomposition into irreducible
elements.


Theorem A.43. Let $F \in k[X]$, $F \neq 0$, be a non-zero polynomial. There
are pairwise distinct irreducible polynomials $P_1, \ldots, P_r$, $r \ge 0$,
exponents $e_1, \ldots, e_r \in \mathbb{N}$, $e_i \ge 1$, $i = 1, \ldots, r$,
and a unit $u \in k^*$, such that
$$F = u \prod_{i=1}^{r} P_i^{e_i}.$$
This factorization is unique in the following sense: If
$$F = v \prod_{i=1}^{s} Q_i^{f_i}$$
is another factorization of $F$, then we have $r = s$, and after a permutation
of the indices $i$ we have $Q_i = u_i P_i$, with $u_i \in k^*$, and
$e_i = f_i$ for $1 \le i \le r$.

Proof. The proof runs in the same way as the proof of the Fundamental
Theorem of Arithmetic (Theorem A.12).


A.5.2 Residue Class Rings 


As in the ring of integers, we can consider residue classes in k[X] and residue 
class rings. 


Definition A.44. Let $P \in k[X]$ be a polynomial of degree $\ge 1$:

1. $F, G \in k[X]$ are congruent modulo $P$, written as
   $$F \equiv G \bmod P,$$
   if $P$ divides $F - G$. This means that $F$ and $G$ have the same remainder
   when divided by $P$, i.e., $F \bmod P = G \bmod P$.

2. Let $F \in k[X]$. $[F] := \{G \in k[X] \mid G \equiv F \bmod P\}$ is called
   the residue class of $F$ modulo $P$.


As before, "congruent modulo" is an equivalence relation, the equivalence
classes are the residue classes, and the set of residue classes
$$k[X]/Pk[X] := \{[F] \mid F \in k[X]\}$$
is a ring. Residue classes are added and multiplied by adding and multiplying
a representative:
$$[F] + [G] := [F + G], \qquad [F] \cdot [G] := [F \cdot G].$$


We also have a natural representative of $[F]$, the remainder $F \bmod P$ of
$F$ modulo $P$: $[F] = [F \bmod P]$. As remainders modulo $P$, we get all the
polynomials which have a degree $< \deg(P)$. Therefore, we have a one-to-one
correspondence between $k[X]/Pk[X]$ and the set of residues
$\{F \in k[X] \mid \deg(F) < \deg(P)\}$. We often identify both sets:
$$k[X]/Pk[X] = \{F \in k[X] \mid \deg(F) < \deg(P)\}.$$
Two residues $F$ and $G$ are added or multiplied by first adding or multiplying
them as polynomials and then taking the residue modulo $P$. Since the sum
of two residues $F$ and $G$ has a degree $< \deg(P)$, it is a residue, and we do
not have to reduce. After a multiplication, we have, in general, to take the
remainder.

Addition: $(F, G) \mapsto F + G$, Multiplication: $(F, G) \mapsto F \cdot G \bmod P$.


Let $n := \deg(P)$ be the degree of $P$. The residue class ring $k[X]/Pk[X]$ is
an $n$-dimensional vector space over $k$. A basis of this vector space is given
by the elements $[1], [X], [X^2], \ldots, [X^{n-1}]$. If $k$ is a finite field
with $q$ elements, then $k[X]/Pk[X]$ consists of $q^n$ elements.

Example. Let $k = \mathbb{F}_2 = \mathbb{Z}_2 = \{0, 1\}$ be the field with two
elements 0 and 1 consisting of the residues modulo 2, and
$P := X^8 + X^4 + X^3 + X + 1 \in k[X]$. The elements of $k[X]/Pk[X]$ may be
identified with the binary polynomials
$b_7 X^7 + b_6 X^6 + \ldots + b_1 X + b_0$, $b_i \in \{0, 1\}$, $0 \le i \le 7$,
of degree $\le 7$. The ring $k[X]/Pk[X]$ contains $2^8 = 256$ elements. We have,
for example,
$$(X^6 + X^3 + X^2 + 1) \cdot (X^5 + X^2 + 1)$$
$$= X^{11} + X^7 + X^6 + X^4 + X^3 + 1$$
$$= X^3 \cdot (X^8 + X^4 + X^3 + X + 1) + 1$$
$$\equiv 1 \bmod (X^8 + X^4 + X^3 + X + 1).$$
Thus, $X^6 + X^3 + X^2 + 1$ is a unit in $k[X]/Pk[X]$, and its inverse is
$X^5 + X^2 + 1$.


We may characterize units as in the integer case. 


Proposition A.45. An element $[F] \in k[X]/Pk[X]$ is a unit if and only if $F$
is prime to $P$. The multiplicative inverse $[F]^{-1}$ of a unit $[F]$ can be
computed using the extended Euclidean algorithm.


Proof. The proof is the same as the proof in the integer case (see Proposition
A.16). Recall that the inverse may be calculated as follows: If $F$ is prime to
$P$, then the extended Euclidean algorithm produces a linear combination
$$C \cdot F + D \cdot P = 1, \text{ with polynomials } C, D \in k[X].$$
We see that $C \cdot F \equiv 1 \bmod P$. Hence, $[C]$ is the inverse $[F]^{-1}$.
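
The computation of an inverse by the extended Euclidean algorithm can be
sketched for $k = \mathbb{F}_2$ as follows. The code is a minimal illustration
under our own assumptions: polynomials are encoded as integers (bit $i$ is the
coefficient of $X^i$), the helper names are hypothetical, and only the
coefficient $C$ of $F$ is tracked, since $D$ is not needed for the inverse.

```python
def gf2_mul(a, b):
    """Carry-less product of two binary polynomials (ints as bit vectors)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

def gf2_divmod(f, g):
    """Return (q, r) with f = q*g + r over F_2; g must be non-zero."""
    if g == 0:
        raise ZeroDivisionError("division by the zero polynomial")
    q = 0
    while f.bit_length() >= g.bit_length():
        shift = f.bit_length() - g.bit_length()
        q ^= 1 << shift
        f ^= g << shift
    return q, f

def gf2_inverse(f, p):
    """Return C with C*F = 1 mod P over F_2, or None if F is not prime to P."""
    r0, r1 = p, f
    c0, c1 = 0, 1                  # invariant: r_i = c_i * f (mod p)
    while r1:
        q, r = gf2_divmod(r0, r1)
        r0, r1 = r1, r
        c0, c1 = c1, c0 ^ gf2_mul(q, c1)
    if r0 != 1:                    # gcd is not a unit: no inverse exists
        return None
    return gf2_divmod(c0, p)[1]    # reduce C modulo P

# Inverse of X^6 + X^3 + X^2 + 1 modulo X^8 + X^4 + X^3 + X + 1
inv = gf2_inverse(0x4D, 0x11B)
```

With the inputs shown, the result is `0x25`, i.e., $X^5 + X^2 + 1$, in
agreement with the example above.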


If the polynomial $P$ is irreducible, then all non-zero residues modulo $P$,
i.e., all non-zero polynomials with a degree $< \deg(P)$, are prime to $P$. So
we get the same corollary as in the integer case.


Corollary A.46. Let $P$ be irreducible. Then every $[F] \neq [0]$ in
$k[X]/Pk[X]$ is a unit. Thus, $k[X]/Pk[X]$ is a field.


Remarks: 


1. Let $P$ be an irreducible polynomial of degree $n$. The field $k$ is a subset
   of the larger field $k[X]/Pk[X]$. We therefore call $k[X]/Pk[X]$ an extension
   field of $k$ of degree $n$.

2. If $P$ is reducible, then $P = F \cdot G$, with polynomials $F, G$ of degree
   $< \deg(P)$. Then $[F] \neq [0]$ and $[G] \neq [0]$, but
   $[F] \cdot [G] = [P] = [0]$. $[F]$ and $[G]$ are "zero divisors". They have no
   inverse, and we see that $k[X]/Pk[X]$ is not a field.


A.5.3 Finite Fields 


Now, let $k = \mathbb{Z}_p = \mathbb{F}_p$ be the prime field of residues modulo
$p$, $p \in \mathbb{Z}$ a prime number, and let $P \in \mathbb{F}_p[X]$ be an
irreducible polynomial of degree $n$. Then
$k[X]/Pk[X] = \mathbb{F}_p[X]/P\mathbb{F}_p[X]$ is an extension field of
$\mathbb{F}_p$. It is an $n$-dimensional vector space over $\mathbb{F}_p$, and
it contains $p^n$ elements.

In general, there is more than one irreducible polynomial of degree $n$ over
$\mathbb{F}_p$. Therefore there are more finite fields with $p^n$ elements. For
example, if $Q \in \mathbb{F}_p[X]$ is another irreducible polynomial of degree
$n$, $Q \neq cP$ for all $c \in k$, then $\mathbb{F}_p[X]/Q\mathbb{F}_p[X]$ is a
field with $p^n$ elements, different from $k[X]/Pk[X]$. But one can show that
all the finite fields with $p^n$ elements are isomorphic to each other in a
very natural way. As the mathematicians state it, up to canonical isomorphism,
there is only one finite field with $p^n$ elements. It is denoted by
$\mathbb{F}_{p^n}$ or by $\mathrm{GF}(p^n)$.⁴

If you need a concrete representation of $\mathbb{F}_{p^n}$, then you choose an
irreducible polynomial $P \in \mathbb{F}_p[X]$ of degree $n$, and you have
$\mathbb{F}_{p^n} = \mathbb{F}_p[X]/P\mathbb{F}_p[X]$. But there are different
representations, reflecting your degrees of freedom when choosing the
irreducible polynomial.

One can also prove that in every finite field $k$, the number $|k|$ of elements
in $k$ must be a power $p^n$ of a prime number $p$. Therefore, the fields
$\mathbb{F}_{p^n}$ are all the finite fields that exist.

In cryptography, finite fields play an important role in many places. For
example, the classical ElGamal cryptosystems are based on the discrete
logarithm problem in a finite prime field (see Section 3.5), the elliptic curves
used in cryptography are defined over finite fields, and the basic encryption
operations of the Advanced Encryption Standard AES are algebraic operations in
the field $\mathbb{F}_{2^8}$ with $2^8$ elements. The AES is discussed in this
book (see Section 2.2.2). This motivates the following closer look at the
fields $\mathbb{F}_{2^n}$.

We identify $\mathbb{F}_2 = \mathbb{Z}_2 = \{0, 1\}$. Let
$P = X^n + a_{n-1} X^{n-1} + \ldots + a_1 X + a_0$, $a_i \in \{0, 1\}$,
$0 \le i \le n-1$, be a binary irreducible polynomial of degree $n$. Then
$\mathbb{F}_{2^n} = \mathbb{F}_2[X]/P\mathbb{F}_2[X]$, and we may consider the
binary polynomials
$A = b_{n-1} X^{n-1} + b_{n-2} X^{n-2} + \ldots + b_1 X + b_0$ of degree
$\le n-1$ ($b_i \in \{0, 1\}$, $0 \le i \le n-1$) as the elements of
$\mathbb{F}_{2^n}$. Adding two of these polynomials in $\mathbb{F}_{2^n}$
means to add them as polynomials, and multiplying them means to first
multiply them as polynomials and then take the remainder modulo $P$.

Now we can represent the polynomial $A$ by the $n$-dimensional vector
$b_{n-1} b_{n-2} \ldots b_1 b_0$ of its coefficients. In this way, we get a
binary representation of the elements of $\mathbb{F}_{2^n}$; the elements of
$\mathbb{F}_{2^n}$ are just the bit strings of length $n$. To add two of these
elements means to add them as binary vectors, i.e., you add them bitwise modulo
2, which is the same as bitwise XORing:
$$b_{n-1} b_{n-2} \ldots b_1 b_0 + c_{n-1} c_{n-2} \ldots c_1 c_0$$
$$= (b_{n-1} \oplus c_{n-1})(b_{n-2} \oplus c_{n-2}) \ldots (b_1 \oplus c_1)(b_0 \oplus c_0).$$


To multiply two elements is more complicated: You have to convert the bit
strings to polynomials, multiply them as polynomials, reduce modulo $P$ and
take the coefficients of the remainder. The 0-element of $\mathbb{F}_{2^n}$ is
$00\ldots00$ and the 1-element is $00\ldots001$.


4 Finite fields are also called Galois fields, in honor of the French mathematician
Evariste Galois (1811-1832).

In the Advanced Encryption Standard AES, encryption depends on algebraic
operations in the finite field $\mathbb{F}_{2^8}$. The irreducible binary
polynomial $P := X^8 + X^4 + X^3 + X + 1$ is taken to represent
$\mathbb{F}_{2^8}$ as $\mathbb{F}_2[X]/P\mathbb{F}_2[X]$ (we already used this
polynomial in an example above). Then the elements of $\mathbb{F}_{2^8}$ are
just strings of 8 bits. In this way, a byte is an element of
$\mathbb{F}_{2^8}$ and vice versa. One of the core operations of AES is the
so-called S-Box. The AES S-Box maps a byte $x$ to its inverse $x^{-1}$ in
$\mathbb{F}_{2^8}$ and then modifies the result by an $\mathbb{F}_2$-affine
transformation (see Section 2.2.2). We conclude this section with examples for
adding, multiplying and inverting bytes in $\mathbb{F}_{2^8}$.


01001101 + 00100101 = 01101000,
10111101 · 01101001 = 11111100,
01001101 · 00100101 = 00000001,
01001101^{-1} = 00100101.


As is common practice, we sometimes represent a byte and hence an element
of $\mathbb{F}_{2^8}$ by two hexadecimal digits. Then the examples read as
follows:


4D + 25 = 68, BD · 69 = FC, 4D · 25 = 01, 4D^{-1} = 25.
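
The byte examples can be reproduced with a short Python sketch. It is only an
illustration: the function names are ours, multiplication is done by the usual
shift-and-reduce method instead of a full polynomial product followed by a
division, and the inverse is computed as $a^{254}$, which works because the
multiplicative group of $\mathbb{F}_{2^8}$ has $2^8 - 1 = 255$ elements.

```python
AES_POLY = 0x11B                   # X^8 + X^4 + X^3 + X + 1

def gf256_add(a, b):
    """Addition in F_{2^8} is bitwise XOR."""
    return a ^ b

def gf256_mul(a, b):
    """Multiply two bytes as elements of F_2[X]/(X^8 + X^4 + X^3 + X + 1)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1                    # multiply a by X ...
        if a & 0x100:
            a ^= AES_POLY          # ... and reduce modulo P if the degree is 8
        b >>= 1
    return result

def gf256_inv(a):
    """Inverse of a non-zero byte, computed as a^254 by square-and-multiply."""
    if a == 0:
        raise ZeroDivisionError("0 has no inverse in F_{2^8}")
    result, base, e = 1, a, 254
    while e:
        if e & 1:
            result = gf256_mul(result, base)
        base = gf256_mul(base, base)
        e >>= 1
    return result
```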


A.6 Quadratic Residues 


We will study the question as to which of the residues modulo $n$ are squares.


Definition A.47. Let $n \in \mathbb{N}$ and $x \in \mathbb{Z}$. We say that $x$
is a quadratic residue modulo $n$ if there is an element $y \in \mathbb{Z}$ with
$x \equiv y^2 \bmod n$. Otherwise, $x$ is called a quadratic non-residue modulo
$n$.


Examples: 


1. The numbers 0,1, 4,5,6 and 9 are the quadratic residues modulo 10. 
2. The numbers 0,1,3,4,5 and 9 are the quadratic residues modulo 11. 
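
For small moduli, lists like these can be checked by simply squaring all
residues. The helper name below is our own.

```python
def quadratic_residues(n):
    """Return the sorted list of residues x in {0, ..., n-1} that are squares mod n."""
    return sorted({(y * y) % n for y in range(n)})

qr10 = quadratic_residues(10)
qr11 = quadratic_residues(11)
```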


Remark. The property of being a quadratic residue depends only on the
residue class $[x] \in \mathbb{Z}_n$ of $x$ modulo $n$. An integer $x$ is a
quadratic residue modulo $n$ if and only if its residue class $[x]$ is a square
in the residue class ring $\mathbb{Z}_n$ (i.e., if and only if there is some
$[y] \in \mathbb{Z}_n$ with $[x] = [y]^2$). The residue class $[x]$ is often
also called a quadratic residue.


In most cases we are only interested in the quadratic residues $x$ which are
units modulo $n$ (i.e., $x$ and $n$ are relatively prime, see Proposition A.16).

Definition A.48. The subgroup of $\mathbb{Z}_n^*$ that consists of the residue
classes represented by a quadratic residue is denoted by $\mathrm{QR}_n$:
$$\mathrm{QR}_n = \{[x] \in \mathbb{Z}_n^* \mid \text{There is a } [y] \in \mathbb{Z}_n^* \text{ with } [x] = [y]^2\}.$$
It is called the subgroup of quadratic residues or the subgroup of squares. The
complement of $\mathrm{QR}_n$ is denoted by
$\mathrm{QNR}_n := \mathbb{Z}_n^* \setminus \mathrm{QR}_n$. It is called the
subset of quadratic non-residues.


We give a criterion for determining the quadratic residues modulo a prime. 


Lemma A.49. Let $p$ be a prime $> 2$ and $g \in \mathbb{Z}_p^*$ be a primitive
root of $\mathbb{Z}_p^*$. Let $x \in \mathbb{Z}_p^*$. Then
$[x] \in \mathrm{QR}_p$ if and only if $x \equiv g^t \bmod p$ for some even
number $t$, $0 \le t \le p-2$.


Proof. Recall that $\mathbb{Z}_p^*$ is a cyclic group generated by $g$ (Theorem
A.36). If $[x] \in \mathrm{QR}_p$, then $x \equiv y^2 \bmod p$, and
$y \equiv g^s \bmod p$ for some $s$. Then
$x \equiv g^{2s} \bmod p \equiv g^t \bmod p$, with $t := 2s \bmod (p-1)$ (the
order of $g$ is $p-1$) and $0 \le t \le p-2$. Since $p-1$ is even, $t$ is also
even.

Conversely, if $x \equiv g^t \bmod p$, and $t$ is even, then
$x \equiv (g^{t/2})^2 \bmod p$, which means that $[x] \in \mathrm{QR}_p$.


Proposition A.50. Let $p$ be a prime $> 2$. Exactly half of the elements of
$\mathbb{Z}_p^*$ are squares, i.e., $|\mathrm{QR}_p| = (p-1)/2$.


Proof. Since half of the integers $t$ with $0 \le t \le p-2$ are even, the
proposition follows from the preceding lemma.


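Definition A.51. Let $p$ be a prime $> 2$, and let $x \in \mathbb{Z}$ be prime
to $p$.
$$\left(\frac{x}{p}\right) := \begin{cases} +1 & \text{if } [x] \in \mathrm{QR}_p, \\ -1 & \text{if } [x] \notin \mathrm{QR}_p, \end{cases}$$
is called the Legendre symbol of $x \bmod p$. For $x \in \mathbb{Z}$ with
$p \mid x$, we set $\left(\frac{x}{p}\right) := 0$.


Proposition A.52 (Euler's criterion). Let $p$ be a prime $> 2$, and let
$x \in \mathbb{Z}$. Then
$$\left(\frac{x}{p}\right) \equiv x^{(p-1)/2} \bmod p.$$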


Proof. If $p$ divides $x$, then both sides are congruent 0 modulo $p$. Suppose
$p$ does not divide $x$. Let $[g] \in \mathbb{Z}_p^*$ be a primitive element.

We first observe that $g^{(p-1)/2} \equiv -1 \bmod p$. Namely, $[g]^{(p-1)/2}$
is a solution of the equation $X^2 - 1 = 0$ over the field $\mathbb{Z}_p$.
Hence, $g^{(p-1)/2} \equiv \pm 1 \bmod p$. However,
$g^{(p-1)/2} \bmod p \neq 1$, because the order of $[g]$ is $p-1$.

Let $[x] = [g]^t$, $0 \le t \le p-2$. By Lemma A.49, $[x] \in \mathrm{QR}_p$ if
and only if $t$ is even. On the other hand,
$x^{(p-1)/2} \equiv g^{t(p-1)/2} \equiv \pm 1 \bmod p$, and it is
$\equiv 1 \bmod p$ if and only if $t$ is even. This completes the proof.
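
Euler's criterion translates directly into a one-line computation of the
Legendre symbol. The following sketch assumes, without checking, that $p$ is an
odd prime; the function name `legendre` is our own.

```python
def legendre(x, p):
    """Legendre symbol (x/p) for an odd prime p, computed by Euler's criterion."""
    r = pow(x, (p - 1) // 2, p)    # r is 0, 1 or p - 1
    return -1 if r == p - 1 else r

symbols = [legendre(x, 11) for x in range(1, 11)]
```

The values in `symbols` are $+1$ exactly at the positions $x = 1, 3, 4, 5, 9$,
the non-zero quadratic residues modulo 11.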




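Remarks:


1. The Legendre symbol is multiplicative in $x$:
   $$\left(\frac{xy}{p}\right) = \left(\frac{x}{p}\right) \cdot \left(\frac{y}{p}\right).$$
   This immediately follows, for example, from Euler's criterion. It means
   that $[xy] \in \mathrm{QR}_p$ if and only if either both
   $[x], [y] \in \mathrm{QR}_p$ or both $[x], [y] \notin \mathrm{QR}_p$.

2. The Legendre symbol $\left(\frac{x}{p}\right)$ depends only on $x \bmod p$,
   and the map
   $$\mathbb{Z}_p^* \longrightarrow \{1, -1\}, \quad [x] \mapsto \left(\frac{x}{p}\right)$$
   is a homomorphism of groups.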


We do not give proofs of the following two important results. Proofs 
may be found, for example, in [HarWri79], [Rosen2000], [Koblitz94] and 
[Forster 96]. 


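Theorem A.53. Let $p$ be a prime $> 2$. Then:

1. $\left(\frac{-1}{p}\right) = (-1)^{(p-1)/2} = \begin{cases} +1 & \text{if } p \equiv 1 \bmod 4, \\ -1 & \text{if } p \equiv 3 \bmod 4. \end{cases}$

2. $\left(\frac{2}{p}\right) = (-1)^{(p^2-1)/8} = \begin{cases} +1 & \text{if } p \equiv \pm 1 \bmod 8, \\ -1 & \text{if } p \equiv \pm 3 \bmod 8. \end{cases}$


Theorem A.54 (Law of Quadratic Reciprocity). Let $p$ and $q$ be primes $> 2$,
$p \neq q$. Then
$$\left(\frac{p}{q}\right) \left(\frac{q}{p}\right) = (-1)^{(p-1)(q-1)/4}.$$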


We generalize the Legendre symbol for composite numbers. 


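Definition A.55. Let $n \in \mathbb{Z}$ be a positive odd number and
$n = \prod_{i=1}^{r} p_i^{e_i}$ be the decomposition of $n$ into primes. Let
$x \in \mathbb{Z}$.
$$\left(\frac{x}{n}\right) := \prod_{i=1}^{r} \left(\frac{x}{p_i}\right)^{e_i}$$
is called the Jacobi symbol of $x \bmod n$.

Remarks:

1. The value of $\left(\frac{x}{n}\right)$ only depends on the residue class
   $[x] \in \mathbb{Z}_n$.
2. If $[x] \in \mathrm{QR}_n$, then $[x] \in \mathrm{QR}_p$ for all primes $p$
   that divide $n$. Hence, $\left(\frac{x}{n}\right) = 1$. The converse is not
   true, in general. For example, let $n = pq$ be the product of two primes.
   Then $\left(\frac{x}{n}\right) = \left(\frac{x}{p}\right) \cdot \left(\frac{x}{q}\right)$
   can be 1, whereas both $\left(\frac{x}{p}\right)$ and
   $\left(\frac{x}{q}\right)$ are $-1$. This means that $x \bmod p$ (and
   $x \bmod q$), and hence $x \bmod n$ are not squares.
3. The Jacobi symbol is multiplicative in both arguments:
   $$\left(\frac{xy}{n}\right) = \left(\frac{x}{n}\right) \cdot \left(\frac{y}{n}\right) \quad \text{and} \quad \left(\frac{x}{mn}\right) = \left(\frac{x}{m}\right) \cdot \left(\frac{x}{n}\right).$$
4. The map $\mathbb{Z}_n^* \longrightarrow \{1, -1\}$,
   $[x] \mapsto \left(\frac{x}{n}\right)$ is a homomorphism of groups.
5. $J_n^{+1} := \{[x] \in \mathbb{Z}_n^* \mid \left(\frac{x}{n}\right) = 1\}$ is
   a subgroup of $\mathbb{Z}_n^*$.


Lemma A.56. Let $n \ge 3$ be an odd integer. If $n$ is a square (in
$\mathbb{Z}$), then $\left(\frac{x}{n}\right) = 1$ for all
$[x] \in \mathbb{Z}_n^*$. Otherwise, half of the elements of $\mathbb{Z}_n^*$
have a Jacobi symbol of 1, i.e., $|J_n^{+1}| = \varphi(n)/2$.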


Proof. If $n$ is a square, then the exponents $e_i$ in the prime factorization
of $n$ are all even (notation as above), and the Jacobi symbol is always 1. If
$n$ is not a square, then there is an odd $e_i$, say $e_1$. By the Chinese
Remainder Theorem (Theorem A.29), we find a unit $x$ which is a quadratic
non-residue modulo $p_1$ and a quadratic residue modulo $p_i$ for
$i = 2, \ldots, r$. Then $\left(\frac{x}{n}\right) = -1$, and mapping $[y]$ to
$[y \cdot x]$ yields a one-to-one map from $J_n^{+1}$ to
$\mathbb{Z}_n^* \setminus J_n^{+1}$.


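Theorem A.57. Let $n \ge 3$ be an odd integer. Then:

1. $\left(\frac{-1}{n}\right) = (-1)^{(n-1)/2} = \begin{cases} +1 & \text{if } n \equiv 1 \bmod 4, \\ -1 & \text{if } n \equiv 3 \bmod 4. \end{cases}$

2. $\left(\frac{2}{n}\right) = (-1)^{(n^2-1)/8} = \begin{cases} +1 & \text{if } n \equiv \pm 1 \bmod 8, \\ -1 & \text{if } n \equiv \pm 3 \bmod 8. \end{cases}$


Proof. Let $f(n) = (-1)^{(n-1)/2}$ for statement 1 and
$f(n) = (-1)^{(n^2-1)/8}$ for statement 2. You can easily check that
$f(n_1 n_2) = f(n_1) f(n_2)$ for odd numbers $n_1$ and $n_2$ (for statement 2,
consider the different cases of $n_1, n_2 \bmod 8$). Thus, both sides of the
equations $\left(\frac{-1}{n}\right) = (-1)^{(n-1)/2}$ and
$\left(\frac{2}{n}\right) = (-1)^{(n^2-1)/8}$ are multiplicative in $n$, and the
proposition follows from Theorem A.53.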


Theorem A.58 (Law of Quadratic Reciprocity). Let $n, m \ge 3$ be odd integers.
Then
$$\left(\frac{n}{m}\right) = (-1)^{(n-1)(m-1)/4} \left(\frac{m}{n}\right).$$

Proof. If $m$ and $n$ have a common factor, then both sides are zero by the
definition of the symbols. So we can suppose that $m$ is prime to $n$. We write
$m = p_1 p_2 \ldots p_r$ and $n = q_1 q_2 \ldots q_s$ as a product of primes.
Converting from
$\left(\frac{m}{n}\right) = \prod_{i,j} \left(\frac{p_i}{q_j}\right)$ to
$\left(\frac{n}{m}\right) = \prod_{i,j} \left(\frac{q_j}{p_i}\right)$, we apply
the reciprocity law for the Legendre symbol (Theorem A.54) for each of the
factors. We get $rs$ multipliers
$\varepsilon_{ij} = (-1)^{(p_i-1)(q_j-1)/4}$. As in the previous proof, we use
that $f(n) = (-1)^{(n-1)/2}$ is multiplicative in $n$ and get
$$\prod_{i,j} (-1)^{(p_i-1)(q_j-1)/4} = \prod_j \Big( \prod_i (-1)^{(p_i-1)/2} \Big)^{(q_j-1)/2}$$
$$= \prod_j \Big( (-1)^{(m-1)/2} \Big)^{(q_j-1)/2} = \Big( \prod_j (-1)^{(q_j-1)/2} \Big)^{(m-1)/2}$$
$$= \Big( (-1)^{(n-1)/2} \Big)^{(m-1)/2} = (-1)^{(n-1)(m-1)/4},$$
as desired.


Remark. Computing a Jacobi symbol $\left(\frac{m}{n}\right)$ simply by using the
definition requires knowing the prime factors of $n$. No algorithm is known
that can compute the prime factors in polynomial time. However, using the Law
of Quadratic Reciprocity (Theorem A.58) and Theorem A.57, we can efficiently
compute $\left(\frac{m}{n}\right)$ using the following algorithm, without
knowing the factorization of $n$.


Algorithm A.59.
Jac(int m, n)
1 Replace $m$ by $m \bmod n$.
2 If $m = 0$, then $\left(\frac{m}{n}\right) = 0$, and if $m = 1$, then $\left(\frac{m}{n}\right) = 1$.
3 Else, set $m = 2^t r$, with $r$ odd.
  Compute $\left(\frac{2}{n}\right)^t = (-1)^{t(n^2-1)/8}$ by Theorem A.57.
  If $r = 1$, we are finished.
4 If $r \ge 3$, we still need to compute $\left(\frac{r}{n}\right)$.
  Apply the Law of Quadratic Reciprocity and compute $\left(\frac{r}{n}\right)$ by $\left(\frac{n}{r}\right)$.
5 Now set $m = n$ and $n = r$ and go to 1.


We have $r \le m \bmod n$, and $(n, r)$ becomes the pair $(m, n)$ in the next
iteration. An analysis similar to that of the Euclidean algorithm (Algorithm
A.4) shows that the algorithm terminates after at most $O(\log_2(n))$ iterations
(see Corollary A.10).


Example. We want to determine whether the prime 7331 is a quadratic
residue modulo the prime 9859. For this purpose, we have to compute the
Legendre symbol $\left(\frac{7331}{9859}\right)$ and could do that using Euler's
criterion (Proposition A.52). However, applying Algorithm A.59 is much more
efficient:
$$\left(\frac{7331}{9859}\right) = -\left(\frac{9859}{7331}\right) = -\left(\frac{2528}{7331}\right) = -\left(\frac{2}{7331}\right)^5 \left(\frac{79}{7331}\right)$$
$$= -(-1)^5 \left(\frac{79}{7331}\right) = \left(\frac{79}{7331}\right) = -\left(\frac{7331}{79}\right) = -\left(\frac{63}{79}\right)$$
$$= \left(\frac{79}{63}\right) = \left(\frac{16}{63}\right) = \left(\frac{2}{63}\right)^4 = 1.$$

Thus, 7331 is a quadratic residue modulo 9859. 
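
The steps of Algorithm A.59 can be written as a loop. The following Python
sketch is one possible rendering, not a definitive implementation: the function
name `jacobi` and the running variable `sign`, which accumulates the factors
coming from Theorem A.57 and from the reciprocity law, are our own choices.

```python
def jacobi(m, n):
    """Jacobi symbol (m/n) for an odd integer n >= 3, following Algorithm A.59."""
    if n < 3 or n % 2 == 0:
        raise ValueError("n must be an odd integer >= 3")
    sign = 1
    while True:
        m %= n                                # step 1
        if m == 0:                            # step 2: common factor
            return 0
        t = 0                                 # step 3: m = 2^t * r, r odd
        while m % 2 == 0:
            m //= 2
            t += 1
        if t % 2 == 1 and n % 8 in (3, 5):    # (2/n) = -1 iff n = +-3 mod 8
            sign = -sign
        if m == 1:
            return sign
        if m % 4 == 3 and n % 4 == 3:         # step 4: (r/n) = -(n/r) in this case
            sign = -sign
        m, n = n, m                           # step 5

result = jacobi(7331, 9859)
```

For the pair from the example, `result` is 1, and the intermediate values of
`m` and `n` follow the chain of symbols computed above.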


A.7 Modular Square Roots 


We now discuss how to get the square root of a quadratic residue. Computing 
square roots modulo n can be a difficult or even an infeasible task if n is a 
composite number (and, e.g., the Rabin cryptosystem is based on this; see 
Section 3.6). However, if n is a prime, we can determine square roots using 
an efficient algorithm. 


Proposition A.60. There is a (probabilistic) polynomial algorithm Sqrt
which, given as inputs a prime $p$ and an $a \in \mathrm{QR}_p$, computes a
square root $x \in \mathbb{Z}_p^*$ of $a$: $\mathrm{Sqrt}(p, a) = x$ and
$x^2 = a$ (in $\mathbb{Z}_p$).


Remarks: 


1. The square roots of $a \in \mathrm{QR}_p$ are the solutions of the equation
   $X^2 - a = 0$ over $\mathbb{Z}_p$. Hence $a$ has two square roots (for
   $p > 2$). If $x$ is a square root, then $-x$ is the other root.

2. "Probabilistic" means that random choices⁵ are included in the algorithm.
   Polynomial means that the running time (the number of binary operations) of
   the algorithm is bounded by a polynomial in the binary length of the inputs.
   Sqrt is a so-called Las Vegas algorithm, i.e., we expect Sqrt to return a
   correct result in polynomial time (for a detailed discussion of the notion
   of probabilistic polynomial algorithms, see Chapter 5).


Proof. Let $a \in \mathrm{QR}_p$. By Euler's criterion (Proposition A.52),
$a^{(p-1)/2} = 1$. Hence $a^{(p+1)/2} = a$. We first consider the (easy) case of
$p \equiv 3 \bmod 4$. Since 4 divides $p+1$, $(p+1)/4$ is an integer, and
$x := a^{(p+1)/4}$ is a square root of $a$.

Now assume $p \equiv 1 \bmod 4$. The straightforward computation of the square
root as in the first case does not work, since $(p+1)/2$ is not divisible by 2.
We choose a quadratic non-residue $b \in \mathrm{QNR}_p$ (here the random
choices come into play, see below). By Proposition A.52, $b^{(p-1)/2} = -1$. We
have $a^{(p-1)/2} = 1$, and $(p-1)/2$ is even. Let $(p-1)/2 = 2^l r$, with $r$
odd and $l \ge 1$. We will compute an exponent $s$, such that
$a^r b^{2s} = 1$. Then we are finished. Namely, $a^{r+1} b^{2s} = a$ and
$a^{(r+1)/2} b^s$ is a square root of $a$.

We obtain $s$ in $l$ steps. The intermediate result after step $i$ is a
representation $a^{2^{l-i} r} \cdot b^{2 s_i} = 1$. We start with
$a^{(p-1)/2} \cdot b^0 = a^{(p-1)/2} = 1$ and $s_0 = 0$. Let
$y_i = a^{2^{l-i} r} \cdot b^{2 s_i}$. In the $i$-th step we take the square
root $y_i' := a^{2^{l-i} r} \cdot b^{s_{i-1}}$ of
$y_{i-1} = a^{2^{l-i+1} r} \cdot b^{2 s_{i-1}}$. The value of $y_i'$ is either
1 or $-1$. If $y_i' = 1$, then we take $y_i := y_i'$. If $y_i' = -1$, then we
set $y_i := y_i' \cdot b^{(p-1)/2}$. The first time that $b$ appears with an
exponent $> 0$ in the representation is after the first step (if ever), and
then $b$'s exponent is $(p-1)/2 = 2^l r$. This implies that $s_{i-1}$ is indeed
an even number for $i = 1, \ldots, l$.


5 In this chapter, all random choices are with respect to the uniform distribution.

Thus, we may compute a square root using the following algorithm. 


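Algorithm A.61.
int Sqrt(int a, prime p)
  if $p \equiv 3 \bmod 4$
    then return $a^{(p+1) \,\mathrm{div}\, 4} \bmod p$
    else
      randomly choose $b \in \mathrm{QNR}_p$
      $i \leftarrow (p-1) \,\mathrm{div}\, 2$; $j \leftarrow 0$
      repeat
        $i \leftarrow i \,\mathrm{div}\, 2$; $j \leftarrow j \,\mathrm{div}\, 2$
        if $a^i b^j \equiv -1 \bmod p$
          then $j \leftarrow j + (p-1) \,\mathrm{div}\, 2$
      until $i \equiv 1 \bmod 2$
      return $a^{(i+1) \,\mathrm{div}\, 2} \, b^{j \,\mathrm{div}\, 2} \bmod p$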


In the algorithm we get a quadratic non-residue by a random choice. For
this purpose, we randomly choose an element $b$ of $\mathbb{Z}_p^*$ and test (by
Euler's criterion) whether $b$ is a non-residue. Since half of the elements in
$\mathbb{Z}_p^*$ are non-residues, we expect (on average) to get a non-residue
after 2 random choices (see Lemma B.12).
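
A direct transcription of Algorithm A.61 into Python might look as follows.
This is a sketch under our own assumptions: $p$ is taken to be an odd prime
without a primality check, the function name `sqrt_mod_prime` is ours, and we
add an explicit test by Euler's criterion so that inputs outside
$\mathrm{QR}_p$ are rejected instead of producing a meaningless value.

```python
import random

def sqrt_mod_prime(a, p, rng=random):
    """Square root of a quadratic residue a modulo an odd prime p (Algorithm A.61)."""
    a %= p
    if pow(a, (p - 1) // 2, p) != 1:
        raise ValueError("a is not a quadratic residue modulo p")
    if p % 4 == 3:
        return pow(a, (p + 1) // 4, p)
    # Find a quadratic non-residue b; each trial succeeds with probability 1/2.
    while True:
        b = rng.randrange(2, p)
        if pow(b, (p - 1) // 2, p) == p - 1:
            break
    i, j = (p - 1) // 2, 0         # invariant: a^i * b^j = 1 mod p
    while True:
        i //= 2
        j //= 2
        if (pow(a, i, p) * pow(b, j, p)) % p == p - 1:
            j += (p - 1) // 2      # multiply by b^((p-1)/2) = -1 to restore 1
        if i % 2 == 1:
            break
    return (pow(a, (i + 1) // 2, p) * pow(b, j // 2, p)) % p

x = sqrt_mod_prime(10, 13)         # 13 = 1 mod 4, and 6^2 = 36 = 10 mod 13
```

Which of the two roots $\pm x$ is returned in the case $p \equiv 1 \bmod 4$ may
depend on the randomly chosen non-residue $b$.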


Now let $n$ be a composite number. If $n$ is a product of distinct primes
and if we know these primes, we can apply the Chinese Remainder Theorem
(Theorem A.29) and reduce the computation of square roots in $\mathbb{Z}_n^*$
to the computation of square roots modulo a prime. There we can apply Algorithm
A.61. We discuss this procedure in detail for the RSA and Rabin settings,
where $n = pq$, with $p$ and $q$ being distinct primes. The extended Euclidean
algorithm yields numbers $d, e \in \mathbb{Z}$ with $1 = dp + eq$. By the
Chinese Remainder Theorem, the map
$$\psi_{p,q} : \mathbb{Z}_n \longrightarrow \mathbb{Z}_p \times \mathbb{Z}_q, \quad [x] \mapsto ([x \bmod p], [x \bmod q])$$
is an isomorphism. The inverse map is given by
$$\chi_{p,q} : \mathbb{Z}_p \times \mathbb{Z}_q \longrightarrow \mathbb{Z}_n, \quad ([x_1], [x_2]) \mapsto [dp x_2 + eq x_1].$$
Addition and multiplication in $\mathbb{Z}_p \times \mathbb{Z}_q$ are done
component-wise:
$$([x_1], [x_2]) + ([x_1'], [x_2']) = ([x_1 + x_1'], [x_2 + x_2']),$$
$$([x_1], [x_2]) \cdot ([x_1'], [x_2']) = ([x_1 \cdot x_1'], [x_2 \cdot x_2']).$$
Let $[x] \in \mathbb{Z}_n$ and $([x_1], [x_2]) := \psi_{p,q}([x])$, and
similarly $([a_1], [a_2]) := \psi_{p,q}([a])$ for $[a] \in \mathbb{Z}_n$. We have
$$[x]^2 = [a] \text{ if and only if } [x_1]^2 = [a_1] \text{ and } [x_2]^2 = [a_2]$$
(see the remark after Theorem A.29). Thus, in order to compute the roots of
$[a]$, we can compute the square roots of $[a_1]$ and $[a_2]$ and apply the
inverse Chinese remainder map $\chi_{p,q}$. In $\mathbb{Z}_p$ and
$\mathbb{Z}_q$, we can efficiently compute square roots by using Algorithm A.61.

Recall that $\mathbb{Z}_p$ and $\mathbb{Z}_q$ are fields, and over a field a
quadratic equation $X^2 - [a_1] = 0$ has at most 2 solutions. Hence,
$[a_1] \in \mathbb{Z}_p$ has at most two square roots. The zero-element $[0]$
has the only square root $[0]$. For $p = 2$, the only non-zero element $[1]$ of
$\mathbb{Z}_2$ has the only square root $[1]$. For $p > 2$, a non-zero element
$[a_1]$ has either no or two distinct square roots. If $[x_1]$ is a square root
of $[a_1]$, then $-[x_1]$ is also a root and for $p > 2$,
$[x_1] \neq -[x_1]$.

Combining the square roots of $[a_1]$ in $\mathbb{Z}_p$ with the square roots of
$[a_2]$ in $\mathbb{Z}_q$, we get the square roots of $[a]$ in $\mathbb{Z}_n$.
We summarize the results in the following proposition.


Proposition A.62. Let $p$ and $q$ be distinct primes $> 2$ and $n := pq$.
Assume that the prime factors $p$ and $q$ of $n$ are known. Then the square
roots of a quadratic residue $[a] \in \mathbb{Z}_n$ can be efficiently computed.
Moreover, if $[a] = ([a_1], [a_2])$⁶, then:


1. $[x] = ([x_1], [x_2])$ is a square root of $[a]$ in $\mathbb{Z}_n$ if and only
   if $[x_1]$ is a square root of $[a_1]$ in $\mathbb{Z}_p$ and $[x_2]$ is a
   square root of $[a_2]$ in $\mathbb{Z}_q$.

2. If $[x_1]$ and $-[x_1]$ are the square roots of $[a_1]$, and $[x_2]$ and
   $-[x_2]$ are the square roots of $[a_2]$, then $[u] = ([x_1], [x_2])$,
   $[v] = ([x_1], -[x_2])$, $-[v] = (-[x_1], [x_2])$ and
   $-[u] = (-[x_1], -[x_2])$ are all the square roots of $[a]$.

3. $[0]$ is the only square root of $[0] = ([0], [0])$.

4. If $[a_1] \neq [0]$ and $[a_2] \neq [0]$, which means that $[a]$ is a unit in
   $\mathbb{Z}_n$, then the square roots of $[a]$ given in 2. are pairwise
   distinct, i.e., $[a]$ has four square roots.

5. If $[a_1] = [0]$ (i.e., $p$ divides $a$) and $[a_2] \neq [0]$ (i.e., $q$ does
   not divide $a$), then $[a]$ has only two distinct square roots, $[u]$ and
   $[v]$.


Remark. If one of the primes is 2, say $p = 2$, then statements 1-3 of
Proposition A.62 are also true. Statements 4 and 5 have to be modified. $[a]$
has only one or two roots, because $[x_1] = -[x_1]$. If $[a_2] = 0$, then there
is only one root. If $[a_2] \neq 0$, then the roots $[u]$ and $[v]$ are distinct.
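
The combination of square roots by the inverse Chinese remainder map can be
sketched as follows. To keep the example short and self-contained, we assume
that both primes are $\equiv 3 \bmod 4$, so that roots modulo $p$ and $q$ are
obtained by a single exponentiation; the helper names and the toy modulus
$n = 7 \cdot 11$ are ours. Instead of the coefficients $d, e$ with
$1 = dp + eq$, the code uses $d \equiv p^{-1} \bmod q$ and
$e \equiv q^{-1} \bmod p$, which define the same map modulo $n$.

```python
def crt_pair(x1, x2, p, q):
    """Inverse Chinese remainder map: x mod pq with x = x1 mod p and x = x2 mod q."""
    d = pow(p, -1, q)              # d*p = 1 mod q
    e = pow(q, -1, p)              # e*q = 1 mod p
    return (d * p * x2 + e * q * x1) % (p * q)

def sqrt_3mod4(a, p):
    """Square root of a quadratic residue a modulo a prime p = 3 mod 4."""
    return pow(a, (p + 1) // 4, p)

def square_roots_mod_pq(a, p, q):
    """All square roots of a unit quadratic residue a modulo n = p*q.

    Assumes distinct primes p, q = 3 mod 4 and a in QR_n (not checked).
    """
    x1 = sqrt_3mod4(a % p, p)
    x2 = sqrt_3mod4(a % q, q)
    return sorted({crt_pair(s1 * x1, s2 * x2, p, q)
                   for s1 in (1, -1) for s2 in (1, -1)})

roots = square_roots_mod_pq(4, 7, 11)
```

For $a = 4$ and $n = 77$ this yields the four roots 2, 9, 68 and 75, i.e.,
$\pm 2$ and $\pm 9$ modulo 77.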


6 We identify $\mathbb{Z}_n$ with $\mathbb{Z}_p \times \mathbb{Z}_q$ via the Chinese remainder isomorphism $\psi_{p,q}$.

Conversely, the ability to compute square roots modulo $n$ implies the
ability to factorize $n$.


Lemma A.63. Let $n := pq$, with distinct primes $p, q > 2$. Let $[u]$ and $[v]$
be square roots of $[a] \in \mathrm{QR}_n$ with $[u] \neq \pm [v]$. Then the
prime factors of $n$ can be computed from $[u]$ and $[v]$ using the Euclidean
algorithm.


Proof. We have $n \mid u^2 - v^2 = (u+v)(u-v)$, but $n$ does not divide $u+v$
and $n$ does not divide $u-v$. Hence, the computation of $\gcd(u+v, n)$ yields
one of the prime factors of $n$.
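
In code, the lemma amounts to a single gcd computation. The sketch below uses
the toy modulus $n = 77$ and its square roots of 4 as an illustration; the
function name is our own.

```python
from math import gcd

def factor_from_roots(u, v, n):
    """Given u^2 = v^2 mod n with u != +-v mod n, return a non-trivial factor of n."""
    if (u * u - v * v) % n != 0:
        raise ValueError("u and v are not square roots of the same element")
    if (u - v) % n == 0 or (u + v) % n == 0:
        raise ValueError("need u != +-v mod n")
    return gcd(u + v, n)

f1 = factor_from_roots(2, 9, 77)   # 2^2 = 9^2 = 4 mod 77
f2 = factor_from_roots(2, 68, 77)
```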


Proposition A.64. Let $I := \{n \in \mathbb{N} \mid n = pq,\ p, q \text{ distinct primes}\}$.
Then the following statements are equivalent:


1. There is a probabilistic polynomial algorithm $A_1$ that on inputs $n \in I$
   and $a \in \mathrm{QR}_n$ returns a square root of $a$ in $\mathbb{Z}_n^*$.

2. There is a probabilistic polynomial algorithm $A_2$ that on input $n \in I$
   yields the prime factors of $n$.


Proof. Let $A_1$ be a probabilistic polynomial algorithm that on inputs $n$ and
$a$ returns a square root of $a$ modulo $n$. Then we can find the factors of $n$
in the following way. We randomly select an $x \in \mathbb{Z}_n^*$ and compute
$y = A_1(n, x^2)$. Since $x^2$ has four distinct roots by Proposition A.62, the
probability that $x \neq \pm y$ is $1/2$. If $x \neq \pm y$, we easily compute
the factors by Lemma A.63. Otherwise we choose a new random $x$. We expect to be
successful after two iterations.

Conversely, if we can compute the prime factors of $n$ using a polynomial
algorithm $A_2$, we can also compute (all the) square roots of arbitrary
quadratic residues in polynomial time, as we have seen in Proposition A.62.


The Chinese Remainder isomorphism can also be used to determine the 
number of quadratic residues modulo n. 


Proposition A.65. Let $p$ and $q$ be distinct primes $> 2$, and $n := pq$. Then
$|\mathrm{QR}_n| = (p-1)(q-1)/4$.


Proof. $[a] = ([a_1], [a_2]) \in \mathrm{QR}_n$ if and only if
$[a_1] \in \mathrm{QR}_p$ and $[a_2] \in \mathrm{QR}_q$. By Proposition A.50,
$|\mathrm{QR}_p| = (p-1)/2$ and $|\mathrm{QR}_q| = (q-1)/2$.
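
For small $n$ the count can be confirmed by brute force. The function below is
only a check of the formula on toy moduli, not an efficient method; its name is
our own.

```python
from math import gcd

def count_qr(n):
    """Number of squares among the units modulo n, by exhaustive search."""
    units = [x for x in range(1, n) if gcd(x, n) == 1]
    return len({(x * x) % n for x in units})

c77 = count_qr(77)                 # p = 7, q = 11
c15 = count_qr(15)                 # p = 3, q = 5
```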


Proposition A.66. Let p and q be distinct primes with $p, q \equiv 3 \bmod 4$,
and $n := pq$. Let $[a] \in \mathrm{QR}_n$, and $[u] = ([x_1], [x_2])$, $[v] = ([x_1], -[x_2])$, $-[v] =
(-[x_1], [x_2])$ and $-[u] = (-[x_1], -[x_2])$ be the four square roots of $[a]$ (see
Proposition A.62). Then:


1. $\left(\frac{u}{n}\right) = -\left(\frac{v}{n}\right)$.
2. One and only one of the four square roots is in $\mathrm{QR}_n$.

Proof. 1. We have $u \equiv v \bmod p$ and $u \equiv -v \bmod q$, hence we conclude by
Theorem A.53 that

$$\left(\frac{u}{n}\right) = \left(\frac{u}{p}\right)\left(\frac{u}{q}\right) = \left(\frac{v}{p}\right)\left(\frac{-v}{q}\right) = \left(\frac{v}{p}\right)\left(\frac{v}{q}\right)(-1)^{(q-1)/2} = -\left(\frac{v}{n}\right).$$

2. By Theorem A.53, $\left(\frac{-1}{p}\right) = \left(\frac{-1}{q}\right) = -1$. Thus, exactly one of the roots
$[x_1]$ or $-[x_1]$ is in $\mathrm{QR}_p$, say $[x_1]$, and exactly one of the roots $[x_2]$ or $-[x_2]$ is
in $\mathrm{QR}_q$, say $[x_2]$. Then $[u] = ([x_1], [x_2])$ is the only square root of $[a]$ that is
in $\mathrm{QR}_n$.

Theorem A.67 (Euclid's Theorem). There are infinitely many primes.


Proof. Assume that there are only a finite number of primes $p_1, \ldots, p_r$. Let
$n = 1 + p_1 \cdot \ldots \cdot p_r$. Then $p_i$ does not divide n, $1 \le i \le r$. Thus, either n is a
prime or it contains a new prime factor different from $p_i$, $1 \le i \le r$. This is
a contradiction.


There is the following famous result on the distribution of primes. It is 
called the Prime Number Theorem and was proven by Hadamard and de la 
Vallée Poussin. 


Theorem A.68.
Let $\pi(x) = |\{p \text{ prime} \mid p \le x\}|$. Then for large x,

$$\pi(x) \approx \frac{x}{\ln(x)}.$$

A proof can be found, for example, in [HarWri79] or [Newman80].


Corollary A.69. The frequency of primes among the numbers in the
magnitude of x is approximately $1/\ln(x)$.


Sometimes we are interested in primes which have a given remainder c
modulo b.


Theorem A.70 (Dirichlet's Theorem). Let $b, c \in \mathbb{N}$ and $\gcd(b, c) = 1$.
Let $\pi_{b,c}(x) = |\{p \text{ prime} \mid p \le x,\ p = kb + c,\ k \in \mathbb{N}\}|$. Then for large x,

$$\pi_{b,c}(x) \approx \frac{1}{\varphi(b)} \cdot \frac{x}{\ln(x)}.$$

Corollary A.71. Let $b, c \in \mathbb{N}$ and $\gcd(b, c) = 1$. The frequency of primes
among the numbers a with $a \bmod b = c$ in the magnitude of x is approximately
$b/(\varphi(b)\ln(x))$.

Our goal in this section is to give criteria for prime numbers which can
be efficiently checked by an algorithm. We will use the fact that a proper
subgroup H of $\mathbb{Z}_n^*$ contains at most $|\mathbb{Z}_n^*|/2 = \varphi(n)/2$ elements. More generally,
we prove the following basic result on groups.


Proposition A.72. Let G be a finite group and H C G be a subgroup. Then 
|H| is a divisor of |G|. 


Proof. We consider the following equivalence relation ($\sim$) on G: $g_1 \sim g_2$
if and only if $g_1^{-1} \cdot g_2 \in H$. The equivalence class of an element $g \in G$ is
$gH = \{gh \mid h \in H\}$. Thus, all equivalence classes contain the same number,
namely $|H|$, of elements. Since G is the disjoint union of the equivalence
classes, we have $|G| = |H| \cdot r$, where r is the number of equivalence classes,
and we see that $|H|$ divides $|G|$.


Fermat's Little Theorem (Theorem A.24) yields a necessary condition
for primes. Let $n \in \mathbb{N}$ be an odd number. If n is a prime and $a \in \mathbb{N}$ with
$\gcd(a, n) = 1$, then $a^{n-1} \equiv 1 \bmod n$. If there is an $a \in \mathbb{N}$ with $\gcd(a, n) = 1$
and $a^{n-1} \not\equiv 1 \bmod n$, then n is not a prime.

Unfortunately, the converse is not true: there are composite (i.e., non-prime)
numbers n, such that $a^{n-1} \equiv 1 \bmod n$, for all $a \in \mathbb{N}$ with $\gcd(a, n) = 1$. The
smallest n with this property is $561 = 3 \cdot 11 \cdot 17$.


Definition A.73. Let $n \in \mathbb{N}$, $n \ge 3$, be a composite number. We call n a
Carmichael number if

$$a^{n-1} \equiv 1 \bmod n,$$

for all $a \in \mathbb{N}$ with $\gcd(a, n) = 1$.


Proposition A.74. Let n be a Carmichael number, and let p be a prime that 
divides n. Then p? does not divide n. In other words, the factorization of n 
does not contain squares. 


Proof. Assume $n = p^k m$, with $k \ge 2$ and p does not divide m. Let $b := 1 + pm$.
From $b^p = (1 + pm)^p = 1 + p^2 \cdot \alpha$ we derive that $b^p \equiv 1 \bmod p^2$. Since p does
not divide m, we have $b \not\equiv 1 \bmod p^2$ and conclude that b has order p modulo
$p^2$. Now, b is prime to n, because it is prime to p and m, and n is a Carmichael
number. Hence $b^{n-1} \equiv 1 \bmod n$, and then, in particular, $b^{n-1} \equiv 1 \bmod p^2$.
Thus $p \mid n - 1$ (by Lemma A.32), a contradiction to $p \mid n$.


Proposition A.75. Let n be an odd, composite number that does not contain
squares. Then n is a Carmichael number if and only if $(p - 1) \mid (n - 1)$ for
all prime divisors p of n.


Proof. Let $n = p_1 \cdot \ldots \cdot p_r$, with $p_i$ being a prime, $i = 1, \ldots, r$, and $p_i \neq p_j$
for $i \neq j$. n is a Carmichael number if and only if $a^{n-1} \equiv 1 \bmod n$ for all
a that are prime to n and, by the Chinese Remainder Theorem, this in turn
is equivalent to $a^{n-1} \equiv 1 \bmod p_i$, for all a which are not divided by $p_i$,
$i = 1, \ldots, r$. This is the case if and only if $(p_i - 1) \mid (n - 1)$, for $i = 1, \ldots, r$.
The last equivalence follows from Proposition A.23 and Corollary A.33, since
$\mathbb{Z}_{p_i}^*$ is a cyclic group of order $p_i - 1$ (Theorem A.36).
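
Propositions A.74 and A.75 together give a criterion that is easy to check for small numbers. The following Python sketch compares it with a brute-force check of Definition A.73. The function names are chosen for this illustration, and trial division is used only because the inputs are tiny.

```python
import math


def is_carmichael_by_definition(n):
    """Check Definition A.73 by brute force over all bases prime to n."""
    if n < 3 or all(n % d for d in range(2, math.isqrt(n) + 1)):
        return False  # too small or prime, hence not composite
    return all(pow(a, n - 1, n) == 1
               for a in range(1, n) if math.gcd(a, n) == 1)


def prime_factors(n):
    """Prime factors of n with multiplicity, by trial division."""
    factors, d = [], 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


def is_carmichael_by_criterion(n):
    """Odd, composite, squarefree, and (p - 1) | (n - 1) for every prime p | n."""
    fs = prime_factors(n)
    return (n % 2 == 1 and len(fs) >= 2 and len(set(fs)) == len(fs)
            and all((n - 1) % (p - 1) == 0 for p in fs))


print([n for n in range(3, 3000) if is_carmichael_by_criterion(n)])
```

For 561 the criterion reads: 2, 10 and 16 all divide 560.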


Corollary A.76. Every Carmichael number n contains at least three distinct 
primes. 


Proof. Assume $n = p_1 \cdot p_2$, with $p_1 < p_2$. Then $n - 1 = p_1(p_2 - 1) + (p_1 - 1) \equiv
(p_1 - 1) \bmod (p_2 - 1)$. However, $(p_1 - 1) \not\equiv 0 \bmod (p_2 - 1)$, since $0 < p_1 - 1 <
p_2 - 1$. Hence, $p_2 - 1$ does not divide $n - 1$. This is a contradiction to Proposition A.75.


Though Carmichael numbers are extremely rare (there are only 2163
Carmichael numbers below $25 \cdot 10^9$), the Fermat condition $a^{n-1} \equiv 1 \bmod n$
is not reliable for a primality test.⁷ We are looking for other criteria.

Let $n \in \mathbb{N}$ be an odd number. If n is a prime, by Euler's criterion
(Proposition A.52), $\left(\frac{a}{n}\right) \equiv a^{(n-1)/2} \bmod n$ for every $a \in \mathbb{N}$ with $\gcd(a, n) = 1$. Here
the converse is also true. More precisely:


Proposition A.77. Let n be an odd and composite number. Let

$$E_n = \left\{ [a] \in \mathbb{Z}_n^* \;\middle|\; \left(\frac{a}{n}\right) \not\equiv a^{(n-1)/2} \bmod n \right\}.$$

Then $|E_n| \ge \varphi(n)/2$, i.e., for at least half of the $[a]$, we have
$\left(\frac{a}{n}\right) \not\equiv a^{(n-1)/2} \bmod n$.


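Proof. Let $\overline{E}_n := \mathbb{Z}_n^* \setminus E_n$ be the complement of $E_n$. We have

$$\overline{E}_n = \left\{ [a] \in \mathbb{Z}_n^* \;\middle|\; \left(\frac{a}{n}\right) \equiv a^{(n-1)/2} \bmod n \right\} = \left\{ [a] \in \mathbb{Z}_n^* \;\middle|\; a^{(n-1)/2} \cdot \left(\frac{a}{n}\right)^{-1} \equiv 1 \bmod n \right\}.$$

Since $\overline{E}_n$ is a subgroup of $\mathbb{Z}_n^*$, we could infer $|\overline{E}_n| \le \varphi(n)/2$ if $\overline{E}_n$
were a proper subset of $\mathbb{Z}_n^*$ (Proposition A.72). Then $|E_n| = |\mathbb{Z}_n^*| - |\overline{E}_n| \ge
\varphi(n) - \varphi(n)/2 = \varphi(n)/2$.

Thus it suffices to prove: if $\overline{E}_n = \mathbb{Z}_n^*$, then n is a prime.

Assume $\overline{E}_n = \mathbb{Z}_n^*$ and n is not a prime. From $\left(\frac{a}{n}\right) \equiv a^{(n-1)/2} \bmod n$, it follows
that $a^{n-1} \equiv 1 \bmod n$. Thus n is a Carmichael number, and hence does not
contain squares (Proposition A.74). Let $n = p_1 \cdot \ldots \cdot p_k$, $k \ge 3$, be the
decomposition of n into distinct primes. Let $[v] \in \mathbb{Z}_{p_1}^*$ be a quadratic non-residue,
i.e., $\left(\frac{v}{p_1}\right) = -1$. Using the Chinese Remainder Theorem, choose an $[x] \in \mathbb{Z}_n^*$
with $x \equiv v \bmod p_1$ and $x \equiv 1 \bmod n/p_1$. Then $\left(\frac{x}{n}\right) = \left(\frac{x}{p_1}\right) \cdot \ldots \cdot \left(\frac{x}{p_k}\right) = -1$.
Since $\overline{E}_n = \mathbb{Z}_n^*$, $\left(\frac{x}{n}\right) \equiv x^{(n-1)/2} \bmod n$, and hence $x^{(n-1)/2} \equiv -1 \bmod n$, in
particular $x^{(n-1)/2} \equiv -1 \bmod p_2$. This is a contradiction, since $x \equiv 1 \bmod p_2$.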


7 Numbers n satisfying $a^{n-1} \equiv 1 \bmod n$ are called pseudoprimes for the base a.


The following considerations lead to a necessary condition for primes that
does not require the computation of Jacobi symbols. Let $n \in \mathbb{N}$ be an odd
number, and let $n - 1 = 2^t m$, with m odd. Suppose that n is a prime. Then
$\mathbb{Z}_n$ is a field (Corollary A.17), and hence $\pm 1$ are the only square roots of 1,
i.e., the only solutions of $X^2 - 1$, modulo n. Moreover, $a^{n-1} \equiv 1 \bmod n$ for
every $a \in \mathbb{N}$ that is prime to n (Theorem A.24). Thus

$a^{n-1} \equiv 1 \bmod n$ and $a^{(n-1)/2} \equiv \pm 1 \bmod n$, and
if $a^{(n-1)/2} \equiv 1 \bmod n$, then $a^{(n-1)/4} \equiv \pm 1 \bmod n$, and
if $a^{(n-1)/4} \equiv 1 \bmod n$, then ....

We see: if n is a prime, then for every $a \in \mathbb{N}$ with $\gcd(a, n) = 1$, either
$a^m \equiv 1 \bmod n$, or there is a $j \in \{0, \ldots, t-1\}$ with $a^{2^j m} \equiv -1 \bmod n$. The
converse is also true, i.e., if n is composite, then there exists an $a \in \mathbb{N}$ with
$\gcd(a, n) = 1$, such that $a^m \not\equiv 1 \bmod n$ and $a^{2^j m} \not\equiv -1 \bmod n$ for $0 \le j \le
t - 1$. More precisely:


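Proposition A.78. Let $n \in \mathbb{N}$ be a composite odd number. Let $n - 1 = 2^t m$,
with m odd. Let

$$W_n = \{[a] \in \mathbb{Z}_n^* \mid a^m \not\equiv 1 \bmod n \text{ and } a^{2^j m} \not\equiv -1 \bmod n \text{ for } 0 \le j \le t-1\}.$$

Then $|W_n| \ge \varphi(n)/2$.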


Proof. Let $\overline{W}_n := \mathbb{Z}_n^* \setminus W_n$ be the complement of $W_n$. We will show that $\overline{W}_n$
is contained in a proper subgroup U of $\mathbb{Z}_n^*$. Then the desired estimate follows
by Proposition A.72, as in the proof of Proposition A.77. We distinguish two
cases.

Case 1: There is an $[a] \in \mathbb{Z}_n^*$ with $a^{n-1} \not\equiv 1 \bmod n$.

Then $U = \{[a] \in \mathbb{Z}_n^* \mid a^{n-1} \equiv 1 \bmod n\}$ is a proper subgroup of $\mathbb{Z}_n^*$, which
contains $\overline{W}_n$, and the proof is finished.

Case 2: We have $a^{n-1} \equiv 1 \bmod n$ for all $[a] \in \mathbb{Z}_n^*$.

Then n is a Carmichael number. Hence n does not contain any squares
(Proposition A.74). Let $n = p_1 \cdot \ldots \cdot p_k$, $k \ge 3$, be the decomposition into
distinct primes. We set

$$W_i = \{[a] \in \mathbb{Z}_n^* \mid a^{2^i m} \equiv -1 \bmod n\}.$$

$W_0$ is not empty, since $[-1] \in W_0$. Let $r = \max\{i \mid W_i \neq \emptyset\}$ and

$$U := \{[a] \in \mathbb{Z}_n^* \mid a^{2^r m} \equiv \pm 1 \bmod n\}.$$

U is a subgroup of $\mathbb{Z}_n^*$ and $\overline{W}_n \subseteq U$. Let $[a] \in W_r$. Using the Chinese
Remainder Theorem, we get a $[w] \in \mathbb{Z}_n^*$ with $w \equiv a \bmod p_1$ and $w \equiv 1 \bmod
n/p_1$. Then $w^{2^r m} \equiv -1 \bmod p_1$ and $w^{2^r m} \equiv +1 \bmod p_2$, hence $w^{2^r m} \not\equiv
\pm 1 \bmod n$. Thus $w \notin U$, and we see that U is indeed a proper subgroup of
$\mathbb{Z}_n^*$.


Remark. The set $E_n$ from Proposition A.77 is a subset of $W_n$, and it can even
be proven that $|W_n| \ge \frac{3}{4}\varphi(n)$ (see, e.g., [Koblitz94]).


Probabilistic Primality Tests. The preceding propositions are the basis
of probabilistic algorithms which test whether a given odd number is prime.
Proposition A.77 yields the Solovay-Strassen primality test, and Proposition
A.78 yields the Miller-Rabin primality test. The basic procedure is the same
in both tests: we define a set W of witnesses for the fact that n is composite.
Set $W := E_n$ (Solovay-Strassen) or $W := W_n$ (Miller-Rabin). If we can find
a $w \in W$, then $W \neq \emptyset$, and n is a composite number.
To find a witness $w \in W$, we randomly choose (with respect to the uniform
distribution) an element $a \in \mathbb{Z}_n^*$, and check whether $a \in W$. Since
$|W| \ge \varphi(n)/2$ (Propositions A.77 and A.78), the probability that we get a
witness by the random choice, if n is composite, is $\ge 1/2$. By repeating the
random choice k times, we can increase the probability of finding a witness if
n is composite. The probability is then $\ge 1 - 1/2^k$. If we do not find a witness,
n is considered to be a prime.

The tests are probabilistic and the result is not necessarily correct in all
cases. However, the error probability is $\le 1/2^k$ and hence very small, even for
moderate values of k.


Remark. The primality test of Miller-Rabin is the better choice:


1. The test condition is easier to compute.

2. A witness for the Solovay-Strassen test is also a witness for the
   Miller-Rabin test.

3. In the Miller-Rabin test, the probability of obtaining a witness by one
   random choice is $\ge 3/4$ (we only proved the weaker bound of 1/2).
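
For small composite n, the sets $E_n$ and $W_n$ can simply be enumerated, which makes the three points above concrete. The Python sketch below is an illustration only: the helper names are chosen here, the Jacobi symbol is computed with the usual reciprocity-based loop, and the sample moduli are arbitrary odd composites.

```python
import math


def jacobi(a, n):
    """Jacobi symbol (a/n) for odd n > 0, via quadratic reciprocity."""
    a %= n
    result = 1
    while a != 0:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


def is_euler_witness(a, n):
    """True if [a] lies in E_n, i.e. (a/n) != a^((n-1)/2) mod n."""
    return pow(a, (n - 1) // 2, n) != jacobi(a, n) % n


def is_miller_rabin_witness(a, n):
    """True if [a] lies in W_n, the set of Proposition A.78."""
    m, t = n - 1, 0
    while m % 2 == 0:
        m //= 2
        t += 1
    u = pow(a, m, n)
    if u == 1:
        return False
    for _ in range(t):          # checks a^(2^j * m) for j = 0, ..., t-1
        if u == n - 1:
            return False
        u = u * u % n
    return True


def witness_counts(n):
    """Return (phi(n), |E_n|, |W_n|) for an odd composite n."""
    units = [a for a in range(1, n) if math.gcd(a, n) == 1]
    e = sum(is_euler_witness(a, n) for a in units)
    w = sum(is_miller_rabin_witness(a, n) for a in units)
    return len(units), e, w


for n in (15, 91, 561, 1105):
    print(n, witness_counts(n))
```

Comparing the second and third entries of each printed tuple with the first shows how much larger the set of Miller-Rabin witnesses is in these examples.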


Here follows the algorithm for the Miller-Rabin test (with error probability
$\le 1/4^k$).


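Algorithm A.79.
boolean MillerRabinTest(int n, k)

    if n is even
        then return false
    m ← (n − 1) div 2; t ← 1
    while m is even do
        m ← m div 2; t ← t + 1
    for i ← 1 to k do
        a ← 1 + (Random() mod (n − 1))
        u ← a^m mod n
        if u ≠ 1
            then j ← 1
                 while u ≠ −1 and j < t do
                     u ← u^2 mod n; j ← j + 1
                 if u ≠ −1
                     then return false
    return true

Here the input n is assumed to be an odd number $\ge 3$, the comparison u ≠ −1 is meant
modulo n, and the base a is drawn from $\{1, \ldots, n-1\}$, so that the case a = 0, which would
reject every n, cannot occur.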
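
A runnable Python version of the test might look as follows. It is a sketch rather than a transcription of Algorithm A.79: the handling of n < 5, the choice of bases from {2, ..., n−2} and the optional `rng` parameter are assumptions made for this illustration.

```python
import random


def miller_rabin_test(n, k, rng=random):
    """Return False if n is certainly composite, True if n is probably prime."""
    if n < 2:
        return False
    if n in (2, 3):
        return True
    if n % 2 == 0:
        return False
    # Write n - 1 = 2^t * m with m odd.
    m, t = n - 1, 0
    while m % 2 == 0:
        m //= 2
        t += 1
    for _ in range(k):
        a = rng.randrange(2, n - 1)   # base in {2, ..., n-2}
        u = pow(a, m, n)
        if u == 1 or u == n - 1:
            continue                  # a is not a witness
        for _ in range(t - 1):
            u = u * u % n
            if u == n - 1:
                break                 # a^(2^j * m) = -1 mod n for some j
        else:
            return False              # a is a witness: n is composite
    return True


print([n for n in range(2, 60) if miller_rabin_test(n, 20)])
print(miller_rabin_test(561, 20), miller_rabin_test(2**61 - 1, 20))
```

The bases 1 and n−1 are skipped because they are never witnesses. A base that is not prime to n is always a witness, so drawing from all of {2, ..., n−2} does not cause a prime to be rejected.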


B. Probabilities and Information Theory 


We review some basic notions and results used in this book concerning
probability, probability spaces and random variables. This chapter is also intended
to establish notation. There are many textbooks on probability theory,
including [Bauer96], [Feller68], [GanYlv67], [Gordon97] and [Rényi70].


B.1 Finite Probability Spaces and Random Variables 


First we summarize basic concepts and notations. At the end of this section 
we derive a couple of elementary results. They are useful in our computations 
with probabilities. We consider only finite probability spaces. 


Definition B.1.


1. A probability distribution (or simply distribution) $p = (p_1, \ldots, p_n)$ is a
   tuple of elements $p_i \in \mathbb{R}$, $0 \le p_i \le 1$, called probabilities, such that
   $\sum_{i=1}^{n} p_i = 1$.

2. A probability space $(X, p_X)$ is a finite set $X = \{x_1, \ldots, x_n\}$ equipped with
   a probability distribution $p_X = (p_1, \ldots, p_n)$. $p_i$ is called the probability
   of $x_i$, $1 \le i \le n$. We also write $p_X(x_i) := p_i$ and consider $p_X$ as a map
   $X \to [0, 1]$, called the probability measure on X, associating with $x \in X$
   its probability.

3. An event $\mathcal{E}$ in a probability space $(X, p_X)$ is a subset $\mathcal{E}$ of X. The
   probability measure is extended to events:

$$p_X(\mathcal{E}) = \sum_{y \in \mathcal{E}} p_X(y).$$


Example. Let X be a finite set. The uniform distribution $p_{X,u}$ is defined by
$p_{X,u}(x) := 1/|X|$, for all $x \in X$. All elements of X have the same probability.


Notation. If the probability measure is determined by the context, we often
do not specify it explicitly and simply write X instead of $(X, p_X)$ and
$\mathrm{prob}(x)$ or $\mathrm{prob}(\mathcal{E})$ instead of $p_X(x)$ or $p_X(\mathcal{E})$. If $\mathcal{E}$ and $\mathcal{F}$ are events, we
write $\mathrm{prob}(\mathcal{E}, \mathcal{F})$ instead of $\mathrm{prob}(\mathcal{E} \cap \mathcal{F})$ for the probability that both events
$\mathcal{E}$ and $\mathcal{F}$ occur. Separating events by commas means combining them with
AND. The events $\{x\}$ containing a single element are simply denoted by x.
For example, $\mathrm{prob}(x, \mathcal{E})$ means the probability of $\{x\} \cap \mathcal{E}$ (which is 0 if $x \notin \mathcal{E}$
and $\mathrm{prob}(x)$ otherwise).


Remark. Let X be a probability space. The following properties of the prob- 
ability measure are immediate consequences of Definition B.1: 


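1. $\mathrm{prob}(X) = 1$, $\mathrm{prob}(\emptyset) = 0$.
2. $\mathrm{prob}(\mathcal{A} \cup \mathcal{B}) = \mathrm{prob}(\mathcal{A}) + \mathrm{prob}(\mathcal{B})$, if $\mathcal{A}$ and $\mathcal{B}$ are disjoint events in X.
3. $\mathrm{prob}(X \setminus \mathcal{A}) = 1 - \mathrm{prob}(\mathcal{A})$.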


Definition B.2. Let $S : X \to Y$ be a map from a probability space $(X, p_X)$
to a set Y. Then S and $p_X$ induce a distribution $S(p_X)$ on Y:

$$S(p_X)(y) := p_X(S^{-1}(y)) = p_X(\{x \in X \mid S(x) = y\}).$$

The distribution $S(p_X)$ is called the image (distribution) of $p_X$ under S.


Definition B.3. Let $(X, p_X)$ be a probability space. A map $S : X \to Y$ is
called a Y-valued random variable on X. The distribution $p_S$ of a random
variable S is the image of $p_X$ under S:

$$p_S(y) = S(p_X)(y) = p_X(\{x \in X \mid S(x) = y\}) \text{ for } y \in Y.$$

We call S a real-valued random variable if $Y \subseteq \mathbb{R}$, and a binary random
variable if $Y = \{0, 1\}$. Binary random variables are also called Boolean
predicates.


Remark. Considering the distribution of a random variable $S : X \to Y$
means considering the distribution of the probability space induced as image
on Y by S.

Vice versa, it is sometimes convenient (and common practice in probability
theory) to look at a probability space as if it were a random variable.
Namely, consider $(X, p_X)$ as a random variable $S_X$ that is defined on some
not further-specified probability space $(\Omega, p_\Omega)$ and samples over X according
to the given distribution:

$$S_X : \Omega \to X \text{ and } p_X = S_X(p_\Omega).$$


Definition B.4. Let S be a real-valued random variable on a finite
probability space X. The probabilistic average

$$E(S) := \sum_{x \in X} \mathrm{prob}(x) \cdot S(x)$$

is called the expected value of S.
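
Definitions B.1 to B.4 translate almost literally into code. The following Python sketch represents a finite probability space as a dictionary from elements to probabilities. The representation, the function names and the dice example are assumptions made for this illustration.

```python
from fractions import Fraction


def uniform(elements):
    """Uniform distribution on a finite set, as a dict x -> probability."""
    elements = list(elements)
    return {x: Fraction(1, len(elements)) for x in elements}


def prob(p_x, event):
    """Probability of an event, given as a predicate on the elements of X."""
    return sum(p for x, p in p_x.items() if event(x))


def image(p_x, s):
    """Image distribution S(p_X) of p_X under the map S (Definition B.2)."""
    p_y = {}
    for x, p in p_x.items():
        p_y[s(x)] = p_y.get(s(x), 0) + p
    return p_y


def expected_value(p_x, s):
    """Expected value E(S) of a real-valued random variable S (Definition B.4)."""
    return sum(p * s(x) for x, p in p_x.items())


# Assumed example: two fair dice, S = sum of the pips.
dice = uniform((i, j) for i in range(1, 7) for j in range(1, 7))
total = lambda x: x[0] + x[1]
print(image(dice, total)[7])        # probability that the sum is 7
print(expected_value(dice, total))  # expected sum
```

Exact fractions are used so that identities such as "the probabilities sum to 1" hold without rounding error.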


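Remark. The expected value E(S) is a weighted average. The weight of each
value is its probability.

Definition B.5. Let $(X, p_X)$ be a probability space and $\mathcal{A}, \mathcal{B} \subseteq X$ be events,
with $p_X(\mathcal{B}) > 0$. The conditional probability of $\mathcal{A}$ assuming $\mathcal{B}$ is

$$p_X(\mathcal{A} \mid \mathcal{B}) := \frac{p_X(\mathcal{A}, \mathcal{B})}{p_X(\mathcal{B})}.$$

In particular, we have

$$p_X(x \mid \mathcal{B}) = \begin{cases} p_X(x)/p_X(\mathcal{B}) & \text{if } x \in \mathcal{B}, \\ 0 & \text{if } x \notin \mathcal{B}. \end{cases}$$

The conditional probabilities $p_X(x \mid \mathcal{B})$, $x \in X$, define a probability
distribution on X. They describe the probability of x assuming that event $\mathcal{B}$ occurs.

If $\mathcal{C}$ is a further event, then $p_X(\mathcal{A} \mid \mathcal{B}, \mathcal{C})$ is the conditional probability
of $\mathcal{A}$ assuming $\mathcal{B} \cap \mathcal{C}$. Separating events by commas in a condition means
combining them with AND.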


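Definition B.6. Let $\mathcal{A}, \mathcal{B} \subseteq X$ be events in a probability space $(X, p_X)$.
$\mathcal{A}$ and $\mathcal{B}$ are called independent if and only if $\mathrm{prob}(\mathcal{A}, \mathcal{B}) = \mathrm{prob}(\mathcal{A}) \cdot \mathrm{prob}(\mathcal{B})$.
If $\mathrm{prob}(\mathcal{B}) > 0$, then this condition is equivalent to $\mathrm{prob}(\mathcal{A} \mid \mathcal{B}) = \mathrm{prob}(\mathcal{A})$.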


Definition B.7. A probability space $(X, p_X)$ is called a joint probability
space with factors $(X_1, p_1), \ldots, (X_r, p_r)$, denoted for short by $X_1X_2 \ldots X_r$,
if:


1. The set X is the Cartesian product of the sets $X_1, \ldots, X_r$:
   $$X = X_1 \times X_2 \times \ldots \times X_r.$$
2. The distribution $p_i$, $1 \le i \le r$, is the image of $p_X$ under the projection
   $$\pi_i : X \to X_i,\ (x_1, \ldots, x_r) \mapsto x_i,$$
   which means
   $$p_i(x) = p_X(\pi_i^{-1}(x)), \text{ for } 1 \le i \le r \text{ and } x \in X_i.$$

The probability spaces $X_1, \ldots, X_r$ are called independent if and only if

$$p_X(x_1, \ldots, x_r) = \prod_{i=1}^{r} p_i(x_i), \text{ for all } (x_1, \ldots, x_r) \in X$$

(or, equivalently, if the fibers $\pi_i^{-1}(x_i)$, $1 \le i \le r$,¹ are independent events in
X, in the sense of Definition B.6). In this case X is called the direct product
of the $X_i$, denoted for short by $X = X_1 \times X_2 \times \ldots \times X_r$.


1 The set $\pi^{-1}(x)$ of pre-images of a single element x under a projection map $\pi$ is
called the fiber over x.

Analogously, random variables $S_1, \ldots, S_r$ are called jointly distributed if
there is a joint probability distribution $p_S$ of $S_1, \ldots, S_r$:

$$\mathrm{prob}(S_1 = x_1, S_2 = x_2, \ldots, S_r = x_r) := p_S(x_1, \ldots, x_r).$$

They are called independent if and only if

$$\mathrm{prob}(S_1 = x_1, S_2 = x_2, \ldots, S_r = x_r) = \prod_{i=1}^{r} \mathrm{prob}(S_i = x_i),$$

for all $x_1, \ldots, x_r$.


Remark. Recall that the distribution of a random variable may be considered 
as the distribution of a probability space, and vice versa (see above). In 
this way, joint probability spaces correspond to jointly distributed random 
variables. 


Notation. Let $(XY, p_{XY})$ be a joint probability space with factors $(X, p_X)$
and $(Y, p_Y)$. Let $x \in X$ and $y \in Y$. Then we denote by $\mathrm{prob}(y \mid x)$ the
conditional probability $p_{XY}((x, y) \mid \{x\} \times Y)$ of $(x, y)$, assuming that the first
component is x. Thus, $\mathrm{prob}(y \mid x)$ is the probability that y occurs as the
second component if the first component is x.


Remark. In this book we often meet joint probability spaces in the following
way: a set X and a family $W = (W_x)_{x \in X}$ of sets are given. Then we may
join the set X with the family W, to get

$$X \bowtie W := \{(x, w) \mid x \in X, w \in W_x\} = \bigcup_{x \in X} \{x\} \times W_x.$$

The set $W_x$ becomes the fiber over x.
Assume that probability distributions $p_X$ on X and $p_{W_x}$ on $W_x$, $x \in X$,
are given. Then we get a joint probability distribution $p_{XW}$ on $X \bowtie W$ using

$$p_{XW}(x, w) := p_X(x) \cdot p_{W_x}(w).$$

Conversely, given a distribution $p_{XW}$ on $X \bowtie W$, we can project $X \bowtie W$ to
X. We get the image distribution $p_X$ on X using

$$p_X(x) := \sum_{w \in W_x} p_{XW}(x, w)$$

and the probability distributions $p_{W_x}$ on $W_x$, $x \in X$, using

$$p_{W_x}(w) = p_{XW}((x, w) \mid \{x\} \times W_x) = p_{XW}(x, w)/p_X(x).$$

The probabilities $p_{W_x}$ are conditional probabilities.
$\mathrm{prob}(w \mid x) := p_{W_x}(w)$ is the conditional probability that w occurs as the
second component, assuming that the first component x is given.

We write XW for short for the probability space $(X \bowtie W, p_{XW})$, as
before, and we call it a joint probability space. At first glance it does not
meet Definition B.7, because the underlying set is not a Cartesian product
(except in the case where all sets $W_x$ are equal). However, all sets are assumed
to be finite. Therefore, we can easily embed all the sets $W_x$ into one larger
set $\tilde{W}$. Then $X \bowtie W \subseteq X \times \tilde{W}$, and $p_{XW}$ may also be considered as a
probability distribution on $X \times \tilde{W}$ (extend it by zero, so that all elements
outside $X \bowtie W$ have a probability of 0). In this way we get a joint probability
space $X\tilde{W}$ in the strict sense of Definition B.7. $XW$ and $X\tilde{W}$ are practically
the "same" as probability spaces (the difference has a measure of 0).

As an example, think of the domain of the modular squaring one-way
function (Section 3.2) $(n, x) \mapsto x^2 \bmod n$, where $n \in I_k = \{pq \mid
p, q \text{ distinct primes}, |p| = |q| = k\}$ and $x \in \mathbb{Z}_n$, and assume that the moduli
n (the keys, see Rabin encryption, Section 3.6.1) and the elements x (the
messages) are selected according to some probability distributions (e.g. the
uniform ones). Here $X = I_k$ and $W = (\mathbb{Z}_n)_{n \in I_k}$.

The joining may be iterated. Let $X_1 = (X_1, p_1)$ be a probability space
and let $X_j = (X_{j,x}, p_{j,x})_{x \in X_1 \bowtie \ldots \bowtie X_{j-1}}$, $2 \le j \le r$, be families of probability
spaces. Then by iteratively joining the fibers we get a joint probability space
$(X_1X_2 \ldots X_r, p_{X_1X_2 \ldots X_r})$.


Notation. We introduce some notation which turns out to be very useful in
many situations.

Let $(X, p_X)$ be a probability space and $B : X \to \{0, 1\}$ be a Boolean
predicate. Then

$$\mathrm{prob}(B(x) = 1 : x \xleftarrow{p_X} X) := p_X(\{x \in X \mid B(x) = 1\}).$$

The notation suggests that $p_X(\{x \in X \mid B(x) = 1\})$ is the probability for
$B(x) = 1$ if x is randomly selected from X according to $p_X$. If the distribution
$p_X$ is clear from the context, we simply write

$$\mathrm{prob}(B(x) = 1 : x \leftarrow X)$$

instead of $\mathrm{prob}(B(x) = 1 : x \xleftarrow{p_X} X)$. If $p_X$ is the uniform distribution, we write

$$\mathrm{prob}(B(x) = 1 : x \xleftarrow{u} X).$$

Sometimes, we denote the probability distribution on X by $x \leftarrow X$ and
the uniform distribution by $x \xleftarrow{u} X$. We emphasize in this way that the
members of X are chosen randomly.

If $Y \subset X$ and $p_Y$ is a distribution on Y, then the notation $x \leftarrow Y$ is not
only used for the distribution $p_Y$, but also for the image of $p_Y$ on X (under
the inclusion map). This intuitively suggests that elements x are randomly
chosen from the subset Y, whereas elements outside of Y have zero probability
and are not chosen.

As before, let $S(p_X)$ denote the image distribution under a map
$S : X \to Y$. Then

$$\mathrm{prob}(S(x) = y : x \leftarrow X) = S(p_X)(y) = p_X(\{x \in X \mid S(x) = y\}), \text{ for } y \in Y.$$

Sometimes we denote the image distribution by $\{S(x) : x \leftarrow X\}$. This notation
indicates that the probability measure is concentrated on the image of
X (only the elements $S(x)$ in Y can have a probability > 0) and that the
probability of $y \in Y$ is given by the probability for y appearing as $S(x)$, if x
is randomly selected from X.

Now let $(XW, p_{XW})$ be a joint probability space, $W = (W_x)_{x \in X}$, as
introduced above, and let $p_X$ and $p_{W_x}$, $x \in X$, be the probability distributions
induced on X and the fibers $W_x$. We write

$$\mathrm{prob}(B(x, w) = 1 : x \xleftarrow{p_X} X, w \xleftarrow{p_{W_x}} W_x), \text{ or simply}$$
$$\mathrm{prob}(B(x, w) = 1 : x \leftarrow X, w \leftarrow W_x)$$

instead of $p_{XW}(\{(x, w) \mid B(x, w) = 1\})$. Here, B is a Boolean predicate on
$X \bowtie W$. The notation suggests that we mean the probability for $B(x, w) = 1$,
if first x is randomly selected and then w is randomly selected from $W_x$.

More generally, if $(X_1X_2 \ldots X_r, p_{X_1X_2 \ldots X_r})$ is formed by iteratively joining
fibers (see above; we also use the notation from above), then we write

$$\mathrm{prob}(B(x_1, \ldots, x_r) = 1 : x_1 \leftarrow X_1, x_2 \leftarrow X_{2,x_1}, x_3 \leftarrow X_{3,x_1x_2}, \ldots, x_r \leftarrow X_{r,x_1 \ldots x_{r-1}})$$

instead of $p_{X_1X_2 \ldots X_r}(\{(x_1, \ldots, x_r) \mid B(x_1, \ldots, x_r) = 1\})$. Again we write
more precisely $\xleftarrow{u}$ (or $\xleftarrow{p}$) instead of $\leftarrow$ if the distribution is the uniform one
(or not clear from the context).

The distribution $x_j \leftarrow X_{j,x_1 \ldots x_{j-1}}$ is the conditional distribution of $x_j \in
X_{j,x_1 \ldots x_{j-1}}$, assuming $x_1, \ldots, x_{j-1}$; i.e., it gives the conditional probabilities
$\mathrm{prob}(x_j \mid x_1, \ldots, x_{j-1})$. We have (for r = 3) that

$$\mathrm{prob}(B(x_1, x_2, x_3) = 1 : x_1 \leftarrow X_1, x_2 \leftarrow X_{2,x_1}, x_3 \leftarrow X_{3,x_1x_2})
= \sum_{(x_1, x_2, x_3) : B(x_1, x_2, x_3) = 1} \mathrm{prob}(x_1) \cdot \mathrm{prob}(x_2 \mid x_1) \cdot \mathrm{prob}(x_3 \mid x_2, x_1).$$
We now derive a couple of elementary results. They are used in our com- 
putations with probabilities. 


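Proposition B.8. Let X be a finite probability space, and let X be the
disjoint union of events $\mathcal{E}_1, \ldots, \mathcal{E}_r \subseteq X$, with $\mathrm{prob}(\mathcal{E}_i) > 0$ for $i = 1, \ldots, r$. Then

$$\mathrm{prob}(\mathcal{A}) = \sum_{i=1}^{r} \mathrm{prob}(\mathcal{E}_i) \cdot \mathrm{prob}(\mathcal{A} \mid \mathcal{E}_i)$$

for every event $\mathcal{A} \subseteq X$.

Proof.

$$\mathrm{prob}(\mathcal{A}) = \sum_{i=1}^{r} \mathrm{prob}(\mathcal{A} \cap \mathcal{E}_i) = \sum_{i=1}^{r} \mathrm{prob}(\mathcal{E}_i) \cdot \mathrm{prob}(\mathcal{A} \mid \mathcal{E}_i).$$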


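Lemma B.9. Let $\mathcal{A}, \mathcal{B}, \mathcal{E} \subseteq X$ be events in a probability space X, with
$\mathrm{prob}(\mathcal{E}) > 0$. Suppose that $\mathcal{A}$ and $\mathcal{B}$ have the same conditional probability
assuming $\mathcal{E}$, i.e., $\mathrm{prob}(\mathcal{A} \mid \mathcal{E}) = \mathrm{prob}(\mathcal{B} \mid \mathcal{E})$. Then

$$|\mathrm{prob}(\mathcal{A}) - \mathrm{prob}(\mathcal{B})| \le \mathrm{prob}(X \setminus \mathcal{E}).$$

Proof. By Proposition B.8, we have

$$\mathrm{prob}(\mathcal{A}) = \mathrm{prob}(\mathcal{E}) \cdot \mathrm{prob}(\mathcal{A} \mid \mathcal{E}) + \mathrm{prob}(X \setminus \mathcal{E}) \cdot \mathrm{prob}(\mathcal{A} \mid X \setminus \mathcal{E})$$

and an analogous equality for $\mathrm{prob}(\mathcal{B})$. Subtracting both equalities, we get
the desired inequality.


Lemma B.10. Let $\mathcal{A}, \mathcal{E} \subseteq X$ be events in a probability space X, with
$\mathrm{prob}(\mathcal{E}) > 0$. Then

$$|\mathrm{prob}(\mathcal{A}) - \mathrm{prob}(\mathcal{A} \mid \mathcal{E})| \le \mathrm{prob}(X \setminus \mathcal{E}).$$

Proof. By Proposition B.8, we have

$$\mathrm{prob}(\mathcal{A}) = \mathrm{prob}(\mathcal{E}) \cdot \mathrm{prob}(\mathcal{A} \mid \mathcal{E}) + \mathrm{prob}(X \setminus \mathcal{E}) \cdot \mathrm{prob}(\mathcal{A} \mid X \setminus \mathcal{E}).$$

Hence,

$$|\mathrm{prob}(\mathcal{A}) - \mathrm{prob}(\mathcal{A} \mid \mathcal{E})| = \mathrm{prob}(X \setminus \mathcal{E}) \cdot |\mathrm{prob}(\mathcal{A} \mid X \setminus \mathcal{E}) - \mathrm{prob}(\mathcal{A} \mid \mathcal{E})| \le \mathrm{prob}(X \setminus \mathcal{E}),$$

as desired.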


We continue with two results on expected values of random variables. 


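Proposition B.11. Let R and S be real-valued random variables:
1. $E(R + S) = E(R) + E(S)$.
2. If R and S are independent, then $E(R \cdot S) = E(R) \cdot E(S)$.


Proof. Let $W_R, W_S \subset \mathbb{R}$ be the images of R and S. Since we consider only
finite probability spaces, $W_R$ and $W_S$ are finite sets. We have

$$\begin{aligned}
E(R + S) &= \sum_{x \in W_R, y \in W_S} \mathrm{prob}(R = x, S = y) \cdot (x + y) \\
&= \sum_{x \in W_R, y \in W_S} \mathrm{prob}(R = x, S = y) \cdot x + \sum_{x \in W_R, y \in W_S} \mathrm{prob}(R = x, S = y) \cdot y \\
&= \sum_{x \in W_R} \mathrm{prob}(R = x) \cdot x + \sum_{y \in W_S} \mathrm{prob}(S = y) \cdot y \\
&= E(R) + E(S),
\end{aligned}$$

and if R and S are independent,

$$\begin{aligned}
E(R \cdot S) &= \sum_{x \in W_R, y \in W_S} \mathrm{prob}(R = x, S = y) \cdot x \cdot y \\
&= \sum_{x \in W_R, y \in W_S} \mathrm{prob}(R = x) \cdot \mathrm{prob}(S = y) \cdot x \cdot y \\
&= \left( \sum_{x \in W_R} \mathrm{prob}(R = x) \cdot x \right) \cdot \left( \sum_{y \in W_S} \mathrm{prob}(S = y) \cdot y \right) \\
&= E(R) \cdot E(S),
\end{aligned}$$

as desired.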


A probability space X is the model of a random experiment. n independent
repetitions of the random experiment are modeled by the direct product
$X^n = X \times \ldots \times X$. The following lemma answers the question as to how often
we have to repeat the experiment until a given event is expected to occur.


Lemma B.12. Let $\mathcal{E}$ be an event in a probability space X, with $\mathrm{prob}(\mathcal{E}) =
p > 0$. Repeatedly, we execute the random experiment X independently. Let
G be the number of executions of X, until $\mathcal{E}$ occurs the first time. Then the
expected value E(G) of the random variable G is 1/p.


Proof. We have $\mathrm{prob}(G = t) = p \cdot (1 - p)^{t-1}$. Hence,

$$E(G) = \sum_{t=1}^{\infty} t \cdot p \cdot (1 - p)^{t-1} = -p \cdot \frac{d}{dp} \sum_{t=1}^{\infty} (1 - p)^t = -p \cdot \frac{d}{dp} \frac{1}{p} = \frac{1}{p}.$$


Remark. G is called the geometric random variable with respect to p.
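
The series in the proof of Lemma B.12 can also be checked numerically. The following Python sketch evaluates a partial sum with exact fractions. The value p = 1/4 and the cutoff of 200 terms are arbitrary choices for this illustration.

```python
from fractions import Fraction


def truncated_expectation(p, terms):
    """Partial sum of E(G) = sum_{t>=1} t * p * (1-p)^(t-1) over t = 1..terms."""
    return sum(t * p * (1 - p) ** (t - 1) for t in range(1, terms + 1))


p = Fraction(1, 4)               # assumed example value of prob(E)
approx = truncated_expectation(p, 200)
print(float(approx), float(1 / p))
```

All terms are positive, so every partial sum lies below 1/p, and the gap shrinks geometrically with the number of terms.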


Lemma B.13. Let R, S and B be jointly distributed random variables with
values in {0, 1}. Assume that B and S are independent and that B is
uniformly distributed: $\mathrm{prob}(B = 0) = \mathrm{prob}(B = 1) = 1/2$. Then

$$\mathrm{prob}(R = S) = \frac{1}{2} + \mathrm{prob}(R = B \mid S = B) - \mathrm{prob}(R = B).$$



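Proof. We denote by $\overline{B}$ the complementary value $1 - B$ of B. First we observe
that $\mathrm{prob}(S = B) = \mathrm{prob}(S = \overline{B}) = 1/2$, since B is uniformly distributed
and independent of S:

$$\begin{aligned}
\mathrm{prob}(S = B) &= \mathrm{prob}(S = 0) \cdot \mathrm{prob}(B = 0 \mid S = 0) + \mathrm{prob}(S = 1) \cdot \mathrm{prob}(B = 1 \mid S = 1) \\
&= \mathrm{prob}(S = 0) \cdot \mathrm{prob}(B = 0) + \mathrm{prob}(S = 1) \cdot \mathrm{prob}(B = 1) \\
&= (\mathrm{prob}(S = 0) + \mathrm{prob}(S = 1)) \cdot \frac{1}{2} = \frac{1}{2}.
\end{aligned}$$

Now, we compute

$$\begin{aligned}
\mathrm{prob}(R = S) &= \frac{1}{2} \mathrm{prob}(R = B \mid S = B) + \frac{1}{2} \mathrm{prob}(R = \overline{B} \mid S = \overline{B}) \\
&= \frac{1}{2} \left( \mathrm{prob}(R = B \mid S = B) + 1 - \mathrm{prob}(R = B \mid S = \overline{B}) \right) \\
&= \frac{1}{2} \left( 1 + \mathrm{prob}(R = B \mid S = B) - \frac{\mathrm{prob}(R = B) - \mathrm{prob}(S = B) \cdot \mathrm{prob}(R = B \mid S = B)}{\mathrm{prob}(S = \overline{B})} \right) \\
&= \frac{1}{2} \left( 1 + \mathrm{prob}(R = B \mid S = B) - \left( 2 \cdot \mathrm{prob}(R = B) - \mathrm{prob}(R = B \mid S = B) \right) \right) \\
&= \frac{1}{2} + \mathrm{prob}(R = B \mid S = B) - \mathrm{prob}(R = B),
\end{aligned}$$

and the lemma is proven.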

B.2 The Weak Law of Large Numbers 


The variance of a random variable S measures how much the values of S on 
average differ from the expected value. 


Definition B.14. Let S be a real-valued random variable. The expected
value $E((S - E(S))^2)$ of $(S - E(S))^2$ is called the variance of S or, for short,
Var(S).


If the variance is small, the probability that the value of S is far from E(S)
is low. This fundamental fact is stated precisely in Chebyshev's inequality:

Theorem B.15 (Chebyshev's inequality). Let S be a real-valued random
variable with expected value $\alpha$ and variance $\sigma^2$. Then for every $\varepsilon > 0$,

$$\mathrm{prob}(|S - \alpha| \ge \varepsilon) \le \frac{\sigma^2}{\varepsilon^2}.$$


Proof. Let $W_S \subset \mathbb{R}$ be the image of S. Since we consider only finite
probability spaces, $W_S$ is a finite set. Computing

$$\sigma^2 = E((S - \alpha)^2) = \sum_{x \in W_S} \mathrm{prob}(S = x) \cdot (x - \alpha)^2
\ge \sum_{x \in W_S, |x - \alpha| \ge \varepsilon} \mathrm{prob}(S = x) \cdot \varepsilon^2 = \varepsilon^2 \cdot \mathrm{prob}(|S - \alpha| \ge \varepsilon),$$

we obtain Chebyshev's inequality.


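Proposition B.16 (weak law of large numbers). Let $S_1, \ldots, S_t$ be
pairwise-independent real-valued random variables, with a common expected value and
a common variance: $E(S_i) = \alpha$ and $\mathrm{Var}(S_i) = \sigma^2$, $i = 1, \ldots, t$.
Then for every $\varepsilon > 0$,

$$\mathrm{prob}\left( \left| \frac{1}{t} \sum_{i=1}^{t} S_i - \alpha \right| < \varepsilon \right) \ge 1 - \frac{\sigma^2}{t \varepsilon^2}.$$


Proof. Let $Z = \frac{1}{t} \sum_{i=1}^{t} S_i$. Then we have $E(Z) = \alpha$ and

$$\begin{aligned}
\mathrm{Var}(Z) = E((Z - \alpha)^2) &= E\left( \left( \frac{1}{t} \sum_{i=1}^{t} (S_i - \alpha) \right)^2 \right) \\
&= \frac{1}{t^2} E\left( \sum_{i=1}^{t} (S_i - \alpha)^2 + \sum_{i \neq j} (S_i - \alpha)(S_j - \alpha) \right) \\
&= \frac{1}{t^2} \left( \sum_{i=1}^{t} E((S_i - \alpha)^2) + \sum_{i \neq j} E((S_i - \alpha) \cdot (S_j - \alpha)) \right) = \frac{\sigma^2}{t}.
\end{aligned}$$

Here observe that for $i \neq j$, $E((S_i - \alpha) \cdot (S_j - \alpha)) = E(S_i - \alpha) \cdot E(S_j - \alpha)$,
since $S_i$ and $S_j$ are independent, and $E(S_i - \alpha) = E(S_i) - \alpha = 0$ (Proposition
B.11).

From Chebyshev's inequality (Theorem B.15), we conclude that

$$\mathrm{prob}(|Z - \alpha| \ge \varepsilon) \le \frac{\sigma^2}{t \varepsilon^2},$$

and the weak law of large numbers is proven.

Remark. In particular, the weak law of large numbers may be applied to t
executions of the random experiment underlying a random variable S. It says
that the mean of the observed values of S comes close to the expected value
of S if the executions are pairwise-independent (or even independent) and their
number t is sufficiently large.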
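
To get a feeling for how conservative the bound of Proposition B.16 is, one can compare it with an exact computation. The Python sketch below does this for t independent binary variables with $\mathrm{prob}(S_i = 1) = \alpha$, where the sum is binomially distributed and $\sigma^2 = \alpha(1 - \alpha)$. The parameter values are assumptions chosen for this illustration.

```python
from fractions import Fraction
from math import comb


def prob_mean_close(t, a, eps):
    """Exact prob(|mean - a| < eps) for t independent binary variables
    with prob(S_i = 1) = a (the sum is binomially distributed)."""
    return sum(comb(t, s) * a**s * (1 - a)**(t - s)
               for s in range(t + 1)
               if abs(Fraction(s, t) - a) < eps)


def chebyshev_lower_bound(t, a, eps):
    """Lower bound 1 - sigma^2 / (t * eps^2) with sigma^2 = a * (1 - a)."""
    return 1 - a * (1 - a) / (t * eps**2)


a, eps = Fraction(1, 2), Fraction(1, 10)   # assumed example values
for t in (10, 100, 1000):
    print(t, float(prob_mean_close(t, a, eps)),
          float(chebyshev_lower_bound(t, a, eps)))
```

For small t the bound can even be negative and thus says nothing. It becomes informative only once t is large compared with $\sigma^2/\varepsilon^2$.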


If S is a binary random variable observing whether an event occurs or 
not, there is an upper estimate for the variance, and we get the following 
corollary. 


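Corollary B.17. Let $S_1, \ldots, S_t$ be pairwise-independent binary random
variables, with a common expected value and a common variance: $E(S_i) = \alpha$ and
$\mathrm{Var}(S_i) = \sigma^2$, $i = 1, \ldots, t$. Then for every $\varepsilon > 0$,

$$\mathrm{prob}\left( \left| \frac{1}{t} \sum_{i=1}^{t} S_i - \alpha \right| < \varepsilon \right) \ge 1 - \frac{1}{4 t \varepsilon^2}.$$


Proof. Since $S_i$ is binary, we have $S_i = S_i^2$ and get

$$\mathrm{Var}(S_i) = E((S_i - \alpha)^2) = E(S_i^2 - 2\alpha S_i + \alpha^2)
= E(S_i^2) - 2\alpha E(S_i) + \alpha^2 = \alpha - 2\alpha^2 + \alpha^2 = \alpha(1 - \alpha) \le \frac{1}{4}.$$

Now apply Proposition B.16.


Corollary B.18. Let $S_1, \ldots, S_t$ be pairwise-independent binary random
variables, with a common expected value and a common variance: $E(S_i) = \alpha$ and
$\mathrm{Var}(S_i) = \sigma^2$, $i = 1, \ldots, t$. Assume $E(S_i) = \alpha = 1/2 + \varepsilon$, $\varepsilon > 0$. Then

$$\mathrm{prob}\left( \sum_{i=1}^{t} S_i > \frac{t}{2} \right) \ge 1 - \frac{1}{4 t \varepsilon^2}.$$


Proof. If $\left| \frac{1}{t} \sum_{i=1}^{t} S_i - \alpha \right| < \varepsilon$, then $\frac{1}{t} \sum_{i=1}^{t} S_i > \alpha - \varepsilon = \frac{1}{2}$.
We conclude from Corollary B.17 that

$$\mathrm{prob}\left( \sum_{i=1}^{t} S_i > \frac{t}{2} \right) \ge \mathrm{prob}\left( \left| \frac{1}{t} \sum_{i=1}^{t} S_i - \alpha \right| < \varepsilon \right) \ge 1 - \frac{1}{4 t \varepsilon^2},$$

and the corollary is proven.


B.3 Distance Measures


We define the distance between probability distributions. Further, we prove
some statements on the behavior of negligible probabilities when the
probability distribution is varied a little.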


Definition B.19. Let $p$ and $\tilde{p}$ be probability distributions on a finite set X.
The statistical distance between $p$ and $\tilde{p}$ is

$$\mathrm{dist}(p, \tilde{p}) := \frac{1}{2} \sum_{x \in X} |p(x) - \tilde{p}(x)|.$$

Remark. The statistical distance defines a metric on the set of distributions 
on X. 
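
The definition is easy to evaluate for small examples. The Python sketch below computes the statistical distance of two distributions given as dictionaries and, by brute force over all subsets, the largest gap $|p(\mathcal{E}) - \tilde{p}(\mathcal{E})|$ over events; Lemma B.20 states that the two values agree. The example distributions and function names are assumptions made for this illustration.

```python
from fractions import Fraction
from itertools import chain, combinations


def dist(p, q):
    """Statistical distance between two distributions given as dicts."""
    keys = set(p) | set(q)
    return sum(abs(p.get(x, 0) - q.get(x, 0)) for x in keys) / 2


def max_event_gap(p, q):
    """Largest |p(E) - q(E)| over all events E (brute force over subsets)."""
    keys = sorted(set(p) | set(q))
    events = chain.from_iterable(combinations(keys, r)
                                 for r in range(len(keys) + 1))
    return max(abs(sum(p.get(x, 0) - q.get(x, 0) for x in e)) for e in events)


# Assumed example distributions on X = {0, 1, 2, 3}.
p = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 4)}
q = {x: Fraction(1, 4) for x in range(4)}
print(dist(p, q), max_event_gap(p, q))
```

The brute-force search visits all $2^{|X|}$ events, so it is only usable for very small X, whereas `dist` needs a single pass over X.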


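Lemma B.20. The statistical distance between probability distributions $p$ and
$\tilde{p}$ on a finite set X is the maximal distance between the probabilities of events
in X, i.e.,

$$\mathrm{dist}(p, \tilde{p}) = \max_{\mathcal{E} \subseteq X} |p(\mathcal{E}) - \tilde{p}(\mathcal{E})|.$$

Proof. Recall that the events in X are the subsets of X. Let

$$\mathcal{E}_1 := \{x \in X \mid p(x) > \tilde{p}(x)\},\quad \mathcal{E}_2 := \{x \in X \mid p(x) < \tilde{p}(x)\},\quad \mathcal{E}_3 := \{x \in X \mid p(x) = \tilde{p}(x)\}.$$

Then

$$0 = p(X) - \tilde{p}(X) = p(\mathcal{E}_1) - \tilde{p}(\mathcal{E}_1) + p(\mathcal{E}_2) - \tilde{p}(\mathcal{E}_2) + p(\mathcal{E}_3) - \tilde{p}(\mathcal{E}_3),$$

and $p(\mathcal{E}_3) - \tilde{p}(\mathcal{E}_3) = 0$. Hence, $p(\mathcal{E}_2) - \tilde{p}(\mathcal{E}_2) = -(p(\mathcal{E}_1) - \tilde{p}(\mathcal{E}_1))$, and we
obviously get

$$\max_{\mathcal{E} \subseteq X} |p(\mathcal{E}) - \tilde{p}(\mathcal{E})| = p(\mathcal{E}_1) - \tilde{p}(\mathcal{E}_1) = -(p(\mathcal{E}_2) - \tilde{p}(\mathcal{E}_2)).$$

We compute

$$\begin{aligned}
\mathrm{dist}(p, \tilde{p}) &= \frac{1}{2} \sum_{x \in X} |p(x) - \tilde{p}(x)| \\
&= \frac{1}{2} \left( \sum_{x \in \mathcal{E}_1} (p(x) - \tilde{p}(x)) - \sum_{x \in \mathcal{E}_2} (p(x) - \tilde{p}(x)) \right) \\
&= \frac{1}{2} \left( p(\mathcal{E}_1) - \tilde{p}(\mathcal{E}_1) - (p(\mathcal{E}_2) - \tilde{p}(\mathcal{E}_2)) \right) \\
&= \max_{\mathcal{E} \subseteq X} |p(\mathcal{E}) - \tilde{p}(\mathcal{E})|.
\end{aligned}$$

The lemma follows.

Lemma B.21. Let $(XW, p_{XW})$ be a joint probability space (see Section B.1,
p. 327 and p. 328). Let $p_X$ be the induced distribution on X, and $\mathrm{prob}(w \mid x)$
be the conditional probability of w, assuming x ($x \in X$, $w \in W_x$). Let $\tilde{p}_X$ be
another distribution on X. Setting $\widetilde{\mathrm{prob}}(x, w) := \tilde{p}_X(x) \cdot \mathrm{prob}(w \mid x)$, we get
another probability distribution $\tilde{p}_{XW}$ on XW (see Section B.1). Then

$$\mathrm{dist}(p_{XW}, \tilde{p}_{XW}) \le \mathrm{dist}(p_X, \tilde{p}_X).$$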

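Proof. We have

$$|p_{XW}(x, w) - \tilde{p}_{XW}(x, w)| = |(p_X(x) - \tilde{p}_X(x)) \cdot \mathrm{prob}(w \mid x)| \le |p_X(x) - \tilde{p}_X(x)|.$$

Summing the middle expression over all $(x, w)$ and using $\sum_{w \in W_x} \mathrm{prob}(w \mid x) = 1$,
the lemma follows immediately from Definition B.19.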


Throughout the book we consider families of sets, probability distributions
on these sets (often the uniform one) and events concerning maps and
probabilistic algorithms between these sets. The index sets J are partitioned index
sets with security parameter k, $J = \bigcup_{k \in \mathbb{N}} J_k$, usually written as $J = (J_k)_{k \in \mathbb{N}}$
(see Definition 6.2). The indexes $j \in J$ are assumed to be binarily encoded,
and k is a measure for the binary length $|j|$ of an index j. Recall that, by
definition, there is an $m \in \mathbb{N}$, with $k^{1/m} \le |j| \le k^m$ for $j \in J_k$. As an example,
think of the family of RSA functions (see Chapters 3 and 6):

$$(\mathrm{RSA}_{n,e} : \mathbb{Z}_n^* \to \mathbb{Z}_n^*,\ x \mapsto x^e)_{(n,e) \in I},$$

where $I_k = \{(n, e) \mid n = pq,\ p \neq q \text{ primes},\ |p| = |q| = k,\ e \text{ prime to } \varphi(n)\}$.
We are often interested in asymptotic statements, i.e., statements holding for
sufficiently large k.


Definition B.22. Let $J = (J_k)_{k \in \mathbb{N}}$ be an index set with security parameter
$k$, and let $(X_j)_{j \in J}$ be a family of sets. Let $p = (p_j)_{j \in J}$ and $\tilde{p} = (\tilde{p}_j)_{j \in J}$ be
families of probability distributions on $(X_j)_{j \in J}$.
$p$ and $\tilde{p}$ are called polynomially close,² if for every positive polynomial $P$
there is a $k_0 \in \mathbb{N}$, such that for all $k \ge k_0$ and $j \in J_k$

\[ \mathrm{dist}(p_j, \tilde{p}_j) \le \frac{1}{P(k)}. \]

Remarks: 


1. “Polynomially close” defines an equivalence relation between distribu- 
tions. 

2. Polynomially close distributions cannot be distinguished by a statistical 
test implemented as a probabilistic polynomial algorithm (see [Luby96], 
Lecture 7). We do not consider probabilistic polynomial algorithms in 
this appendix. Probabilistic polynomial statistical tests for pseudoran- 
dom sequences are studied in Chapter 8 (see, e.g., Definition 8.2). 


² Also called ($\varepsilon$)-statistically indistinguishable (see, e.g., [Luby96]) or statistically
close (see, e.g., [Goldreich01]).

The following lemma gives an example of polynomially close distributions.


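Lemma B.23. Let $J_k := \{n \mid n = rs,\ r, s \text{ primes},\ |r| = |s| = k,\ r \ne s\}$³ and
$J := \bigcup_{k \in \mathbb{N}} J_k$. The distributions $x \stackrel{u}{\leftarrow} \mathbb{Z}_n$ and $x \stackrel{u}{\leftarrow} \mathbb{Z}_n^*$ are polynomially close.
In other words, uniformly choosing any $x$ from $\mathbb{Z}_n$ is polynomially close to
choosing only units.


Proof. Let $p_n$ be the uniform distribution on $\mathbb{Z}_n$ and let $\tilde{p}_n$ be the distribution
$x \stackrel{u}{\leftarrow} \mathbb{Z}_n^*$. Then $p_n(x) = 1/n$ for all $x \in \mathbb{Z}_n$, $\tilde{p}_n(x) = 1/\varphi(n)$ if $x \in \mathbb{Z}_n^*$,
and $\tilde{p}_n(x) = 0$ if $x \in \mathbb{Z}_n \setminus \mathbb{Z}_n^*$. We have $|\mathbb{Z}_n^*| = \varphi(n) = n \prod_{p|n} (1 - 1/p)$, where
the product is taken over the distinct primes $p$ dividing $n$ (Corollary A.30).
Hence,

\[ \begin{aligned}
\mathrm{dist}(p_n, \tilde{p}_n)
  &= \frac{1}{2} \sum_{x \in \mathbb{Z}_n} | p_n(x) - \tilde{p}_n(x) | \\
  &= \frac{1}{2} \Bigl( \sum_{x \in \mathbb{Z}_n^*} \Bigl( \frac{1}{\varphi(n)} - \frac{1}{n} \Bigr) + \sum_{x \in \mathbb{Z}_n \setminus \mathbb{Z}_n^*} \frac{1}{n} \Bigr)
   = 1 - \prod_{p|n} \Bigl( 1 - \frac{1}{p} \Bigr).
\end{aligned} \]

If $n = rs \in J_k$, then

\[ \mathrm{dist}(p_n, \tilde{p}_n) = 1 - \frac{(r-1)(s-1)}{rs} = \frac{1}{r} + \frac{1}{s} - \frac{1}{rs} \le 2 \cdot \frac{1}{2^{k-1}} = \frac{1}{2^{k-2}}, \]

and the lemma follows.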
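
The closed formula in the proof can be verified for a toy modulus. The sketch below computes the statistical distance directly from Definition B.19; the primes 11 and 13 are an arbitrary choice with binary length $k = 4$, and the function name is hypothetical.

```python
from fractions import Fraction
from math import gcd

def dist_uniform_vs_units(n):
    """Statistical distance between the uniform distributions on Z_n and Z_n^*."""
    units = {x for x in range(n) if gcd(x, n) == 1}
    phi = len(units)
    total = Fraction(0)
    for x in range(n):
        p = Fraction(1, n)
        p_tilde = Fraction(1, phi) if x in units else Fraction(0)
        total += abs(p - p_tilde)
    return total / 2

r, s, k = 11, 13, 4          # two distinct primes of binary length k = 4
d = dist_uniform_vs_units(r * s)
assert d == Fraction(1, r) + Fraction(1, s) - Fraction(1, r * s)
assert d <= Fraction(1, 2 ** (k - 2))

# A modulus with the small prime factor 2: the distance stays at least 1/2.
assert dist_uniform_vs_units(2 * 71) >= Fraction(1, 2)
```

The last check shows why both prime factors must be large for the bound to hold.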


Remark. The lemma does not hold for arbitrary $n$, i.e., with
$\tilde{J}_k := \{n \mid |n| = k\}$ instead of $J_k$. Namely, if $n$ has a small prime factor $q$,
then $1 - \prod_{p|n} (1 - 1/p) \ge 1 - (1 - 1/q) = 1/q$, which is not close to 0.


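Example. The distributions $x \stackrel{u}{\leftarrow} \mathbb{Z}_n$ and $x \stackrel{u}{\leftarrow} \mathbb{Z}_n \cap \mathrm{Primes}$ are not
polynomially close.⁴ Their statistical distance is almost 1 (for large $n$).

Namely, let $k = |n|$, and let $p_1$ and $p_2$ be the uniform distributions on $\mathbb{Z}_n$
and $\mathbb{Z}_n \cap \mathrm{Primes}$. As usual, we extend $p_2$ by 0 to a distribution on $\mathbb{Z}_n$. The
number $\pi(x)$ of primes $\le x$ is approximately $x/\ln(x)$ (Theorem A.68). Thus,
we get (with $c = \ln(2)$)

\[ \begin{aligned}
\mathrm{dist}(p_1, p_2)
  &= \frac{1}{2} \sum_{x \in \mathbb{Z}_n} | p_1(x) - p_2(x) | \\
  &= \frac{1}{2} \Bigl( \pi(n) \Bigl( \frac{1}{\pi(n)} - \frac{1}{n} \Bigr) + (n - \pi(n)) \frac{1}{n} \Bigr) \\
  &= 1 - \frac{\pi(n)}{n} \approx 1 - \frac{1}{\ln(n)} = 1 - \frac{1}{c \log_2(n)} \ge 1 - \frac{1}{c(k-1)},
\end{aligned} \]

and see that the statistical distance is close to 1 for large $k$.

³ If $r \in \mathbb{N}$, we denote, as usual, the binary length of $r$ by $|r|$.
⁴ As always, we consider $\mathbb{Z}_n$ as the set $\{0, \ldots, n-1\}$.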
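
To see how the distance in this example behaves for concrete moduli, one can count primes with a sieve. The following sketch is an illustration under the assumption that the primes in $\mathbb{Z}_n$ are those below $n$; the helper names are hypothetical.

```python
from fractions import Fraction

def primes_below(n):
    """Sieve of Eratosthenes: all primes p < n (assumes n >= 2)."""
    sieve = [True] * n
    sieve[0] = sieve[1] = False
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            for multiple in range(i * i, n, i):
                sieve[multiple] = False
    return [i for i in range(n) if sieve[i]]

def dist_uniform_vs_primes(n):
    """Distance between uniform on Z_n and uniform on the primes in Z_n."""
    primes = set(primes_below(n))
    m = len(primes)
    total = sum(abs(Fraction(1, n) - (Fraction(1, m) if x in primes else 0))
                for x in range(n))
    return total / 2

for n in (2 ** 8, 2 ** 12, 2 ** 16):
    print(n, float(dist_uniform_vs_primes(n)))
```

The printed values grow towards 1 as $n$ increases, although only slowly, in line with the approximation $1 - 1/\ln(n)$.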


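Lemma B.24. Let $J = (J_k)_{k \in \mathbb{N}}$ be an index set with security parameter $k$,
and let $(X_j)_{j \in J}$ be a family of sets. Let $p = (p_j)_{j \in J}$ and $\tilde{p} = (\tilde{p}_j)_{j \in J}$ be
families of probability distributions on $(X_j)_{j \in J}$ which are polynomially close.
Let $(\mathcal{E}_j)_{j \in J}$ be a family of events $\mathcal{E}_j \subseteq X_j$.
Then for every positive polynomial $P$, there is a $k_0 \in \mathbb{N}$ such that for all
$k \ge k_0$

\[ | p_j(\mathcal{E}_j) - \tilde{p}_j(\mathcal{E}_j) | \le \frac{1}{P(k)}, \]

for all $j \in J_k$.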


Proof. This is an immediate consequence of Lemma B.20. 


Definition B.25. Let $J = (J_k)_{k \in \mathbb{N}}$ be an index set with security parameter
$k$, and let $(X_j)_{j \in J}$ be a family of sets. Let $p = (p_j)_{j \in J}$ and $\tilde{p} = (\tilde{p}_j)_{j \in J}$ be
families of probability distributions on $(X_j)_{j \in J}$.

$\tilde{p}$ is polynomially bounded by $p$ if there is a positive polynomial $Q$ such that
$p_j(x) Q(k) \ge \tilde{p}_j(x)$ for all $k \in \mathbb{N}$, $j \in J_k$ and $x \in X_j$.


Examples. In both examples, let $J = (J_k)_{k \in \mathbb{N}}$ be an index set with security
parameter $k$:


1. Let $(X_j)_{j \in J}$ and $(Y_j)_{j \in J}$ be families of sets with $Y_j \subseteq X_j$, $j \in J$. Assume
   there is a polynomial $Q$, such that $|Y_j| Q(k) \ge |X_j|$ for all $k \in \mathbb{N}$, $j \in J_k$.
   Then the image of the uniform distributions on $(Y_j)_{j \in J}$ under the
   inclusions $Y_j \subseteq X_j$ is polynomially bounded by the uniform distributions on
   $(X_j)_{j \in J}$. This is obvious, since $1/|Y_j| \le Q(k)/|X_j|$ by assumption.

   For example, $x \stackrel{u}{\leftarrow} \mathbb{Z}_n \cap \mathrm{Primes}$ is polynomially bounded by $x \stackrel{u}{\leftarrow} \mathbb{Z}_n$,
   because the number of primes $\le n$ is of the order $n/k$, with $k = |n|$ being
   the binary length of $n$ (by the Prime Number Theorem, Theorem A.68).

2. Let $(X_j)_{j \in J}$ and $(Y_j)_{j \in J}$ be families of sets. Let $f = (f_j : Y_j \longrightarrow X_j)_{j \in J}$
   be a family of surjective maps, and assume that for $j \in J_k$, each $x \in X_j$
   has at most $Q(k)$ pre-images, $Q$ a polynomial. Then the image of the
   uniform distributions on $(Y_j)_{j \in J}$ under $f$ is polynomially bounded by
   the uniform distributions on $(X_j)_{j \in J}$.

   Namely, let $p_u$ be the uniform probability distribution and $p_f$ be the
   image distribution. We have for $j \in J_k$ and $x \in X_j$ that

   \[ p_f(x) \le \frac{Q(k)}{|Y_j|} \le \frac{Q(k)}{|X_j|} = Q(k)\, p_u(x). \]
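
The second example can be made concrete with a toy map. In the sketch below, the sets, the map `f` and the bound `Q = 4` are hypothetical choices; the code only checks the inequality $p_f(x) \le Q \cdot p_u(x)$ for this single instance and says nothing about asymptotic behaviour.

```python
from collections import Counter
from fractions import Fraction

Y = range(10)
X = range(4)
Q = 4  # assumed bound on the number of pre-images

def f(y):
    """Hypothetical surjective map Y -> X with at most Q pre-images per point."""
    return min(y // 2, 3)

preimages = Counter(f(y) for y in Y)                    # pre-image counts
p_f = {x: Fraction(preimages[x], len(Y)) for x in X}    # image of uniform on Y
p_u = {x: Fraction(1, len(X)) for x in X}               # uniform on X

assert set(preimages) == set(X)                 # f is surjective
assert all(preimages[x] <= Q for x in X)        # at most Q pre-images
assert all(p_f[x] <= Q * p_u[x] for x in X)     # p_f is bounded by Q * p_u
```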


Proposition B.26. Let $J = (J_k)_{k \in \mathbb{N}}$ be an index set with security parameter
$k$. Let $(X_j)_{j \in J}$ be a family of sets. Let $p = (p_j)_{j \in J}$ and $\tilde{p} = (\tilde{p}_j)_{j \in J}$ be
families of probability distributions on $(X_j)_{j \in J}$. Assume that $\tilde{p}$ is polynomially
bounded by $p$. Let $(\mathcal{E}_j)_{j \in J}$ be a family of events $\mathcal{E}_j \subseteq X_j$, whose probability
is negligible with respect to $p$; i.e., for every positive polynomial $P$ there is a
$k_0 \in \mathbb{N}$, such that $p_j(\mathcal{E}_j) \le 1/P(k)$ for $k \ge k_0$ and $j \in J_k$.
Then the events $(\mathcal{E}_j)_{j \in J}$ have negligible probability also with respect to $\tilde{p}$.


Proof. There is a polynomial $Q$, such that $\tilde{p}_j \le Q(k) \cdot p_j$ for $j \in J_k$. Now let
$R$ be a positive polynomial. Then there is some $k_0$, such that for $k \ge k_0$ and
$j \in J_k$

\[ p_j(\mathcal{E}_j) \le \frac{1}{R(k) Q(k)} \quad \text{and hence} \quad \tilde{p}_j(\mathcal{E}_j) \le Q(k) \cdot p_j(\mathcal{E}_j) \le \frac{1}{R(k)}, \]


and the proposition follows. 


B.4 Basic Concepts of Information Theory 


Information theory and the classical notion of provable security for encryp- 
tion algorithms go back to Shannon and his famous papers [Shannon48] and 
[Shannon49]. 

We give a short introduction to some basic concepts and facts from infor- 
mation theory that are needed in this book. Textbooks on the subject include 
[Ash65], [Hamming86], [CovTho92] and [GolPeiSch94]. 

Changing the point of view from probability spaces to random variables 
(see Section B.1), all the following definitions and results can be formulated 
and are valid for (jointly distributed) random variables as well. 


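Definition B.27. Let $X$ be a finite probability space. The entropy or
uncertainty of $X$ is defined by

\[ H(X) := \sum_{x \in X,\ \mathrm{prob}(x) \ne 0} \mathrm{prob}(x) \cdot \log_2 \Bigl( \frac{1}{\mathrm{prob}(x)} \Bigr)
        = - \sum_{x \in X,\ \mathrm{prob}(x) \ne 0} \mathrm{prob}(x) \cdot \log_2(\mathrm{prob}(x)). \]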


Remark. The probability space $X$ models a random experiment. The possible
outcomes of the experiment are the elements $x \in X$. If we execute the
experiment and observe the event $x$, we gain information. The amount of
information we obtain with the occurrence of $x$ (or, equivalently, our
uncertainty whether $x$ will occur), measured in bits, is given by

\[ \log_2 \Bigl( \frac{1}{\mathrm{prob}(x)} \Bigr) = - \log_2(\mathrm{prob}(x)). \]

The lower the probability of $x$, the higher the uncertainty. For example,
tossing a fair coin we have prob(heads) = prob(tails) = 1/2. Thus, the amount
of information obtained with the outcome heads (or tails) is 1 bit. If you
throw a fair die, each outcome has probability 1/6. Therefore, the amount of
information associated with each outcome is $\log_2(6) \approx 2.6$ bits.

The entropy $H(X)$ measures the average (i.e., the expected) amount of
information arising from executing the experiment $X$. For example, toss a
coin which is unfair, say prob(heads) = 3/4 and prob(tails) = 1/4. Then we
obtain $\log_2(4/3) \approx 0.4$ bits of information with outcome heads, and 2 bits
of information with outcome tails; so the average amount of information
resulting from tossing this coin (the entropy) is $3/4 \cdot \log_2(4/3) + 1/4 \cdot 2 \approx 0.8$
bits.
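
The numbers in this remark are easy to reproduce. The following Python sketch implements the entropy of Definition B.27 for a distribution given as a dictionary; the function name `entropy` and the dictionary representation are choices made for this illustration.

```python
from math import log2

def entropy(dist):
    """H(X) = -sum of prob(x) * log2(prob(x)) over outcomes with prob(x) != 0."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)

fair_coin = {"heads": 0.5, "tails": 0.5}
fair_die = {face: 1 / 6 for face in range(1, 7)}
unfair_coin = {"heads": 0.75, "tails": 0.25}

print(entropy(fair_coin))             # 1.0
print(round(entropy(fair_die), 3))    # 2.585
print(round(entropy(unfair_coin), 3)) # 0.811
```

For the fair die, the entropy equals the information $\log_2(6)$ of each single outcome, because all outcomes are equally likely.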


Proposition B.28. Let $X$ be a finite probability space which contains $n$
elements, $X = \{x_1, \ldots, x_n\}$:


1. $0 \le H(X) \le \log_2(n)$.

2. $H(X) = 0$ if and only if there is some $x \in X$ with $\mathrm{prob}(x) = 1$ (and
   hence all other elements in $X$ have a probability of 0).

3. $H(X) = \log_2(n)$ if and only if the distribution on $X$ is uniform.


For the proof of Proposition B.28, we need the following technical lemma. 


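Lemma B.29. Let $(p_1, \ldots, p_n)$ and $(q_1, \ldots, q_n)$ be probability distributions
(i.e., all $p_i$ and $q_i$ are $\ge 0$ and $\sum_{i=1}^{n} p_i = \sum_{i=1}^{n} q_i = 1$, see Definition B.1).
Assume $p_k \ne 0$ and $q_k \ne 0$ for $k = 1, \ldots, n$. Then:


1. \[ \sum_{k=1}^{n} p_k \log_2 \Bigl( \frac{1}{p_k} \Bigr) \le \sum_{k=1}^{n} p_k \log_2 \Bigl( \frac{1}{q_k} \Bigr). \tag{B.1} \]

2. Equality holds in (B.1) if and only if $(p_1, \ldots, p_n) = (q_1, \ldots, q_n)$.

Proof. Since $\log_2(x) = \log_2(e) \cdot \ln(x)$ and $\log_2(e) > 0$, it suffices to prove
the statements for $\ln$ instead of $\log_2$. We have $\ln(x) \le x - 1$ for all $x > 0$, and
$\ln(x) = x - 1$ if and only if $x = 1$. Therefore,

\[ \ln \Bigl( \frac{q_k}{p_k} \Bigr) \le \frac{q_k}{p_k} - 1, \quad \text{hence} \quad p_k \ln \Bigl( \frac{q_k}{p_k} \Bigr) \le q_k - p_k, \quad \text{hence} \]

\[ \sum_{k=1}^{n} p_k \ln \Bigl( \frac{q_k}{p_k} \Bigr) \le \sum_{k=1}^{n} (q_k - p_k), \quad \text{and this implies} \]

\[ \sum_{k=1}^{n} p_k (\ln(q_k) - \ln(p_k)) \le \sum_{k=1}^{n} q_k - \sum_{k=1}^{n} p_k = 0. \]

Obviously, we have equality if and only if $q_k/p_k = 1$ for $k = 1, \ldots, n$, which
means $q_k = p_k$ for $k = 1, \ldots, n$.

Proof (of Proposition B.28). Since $-\mathrm{prob}(x) \cdot \log_2(\mathrm{prob}(x)) \ge 0$ for every
$x \in X$, the first inequality in statement 1, $H(X) \ge 0$, and statement 2 are
immediate consequences of the definition of $H(X)$.

To prove statements 1 and 3, set

\[ p_k := \mathrm{prob}(x_k), \quad q_k := \frac{1}{n}, \quad 1 \le k \le n. \]

Applying Lemma B.29 we get

\[ \sum_{k=1}^{n} p_k \log_2 \Bigl( \frac{1}{p_k} \Bigr) \le \sum_{k=1}^{n} p_k \log_2 \Bigl( \frac{1}{q_k} \Bigr) = \sum_{k=1}^{n} p_k \log_2(n) = \log_2(n). \]

Equality holds instead of $\le$ if and only if

\[ p_k = q_k = \frac{1}{n}, \quad k = 1, \ldots, n, \]

by statement 2 of Lemma B.29.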
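
Inequality (B.1) and the bound $H(X) \le \log_2(n)$ can be spot-checked on random inputs. The sketch below is an illustration only: the helper `random_distribution`, the seed and the floating-point tolerance are assumptions, and a random test is of course no substitute for the proof.

```python
import random
from math import log2

def cross_term(p, q):
    """sum over k of p_k * log2(1 / q_k); equals the entropy of p when q == p."""
    return sum(pk * log2(1 / qk) for pk, qk in zip(p, q))

def random_distribution(n, rng):
    """Random distribution with strictly positive entries (hypothetical helper)."""
    weights = [rng.random() + 1e-9 for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]

rng = random.Random(0)
n = 5
for _ in range(1000):
    p = random_distribution(n, rng)
    q = random_distribution(n, rng)
    assert cross_term(p, p) <= cross_term(p, q) + 1e-12   # inequality (B.1)
    assert cross_term(p, p) <= log2(n) + 1e-12            # H(X) <= log2(n)

uniform = [1 / n] * n
print(cross_term(uniform, uniform), log2(n))  # both are log2(5) up to rounding
```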


In the following we assume without loss of generality that all elements of 
probability spaces have a probability > 0. 

We consider joint probability spaces XY and will see how to specify the 
amount of information gathered about X when learning Y. We will often 
use the intuitive notation $\mathrm{prob}(y|x)$, for $x \in X$ and $y \in Y$. Recall that
prob(y|x) is the probability that y occurs as the second component, if the 
first component is x (see Section B.1, p. 328). 


Definition B.30. Let $X$ and $Y$ be finite probability spaces with joint
distribution $XY$.

The joint entropy $H(XY)$ of $X$ and $Y$ is the entropy of the joint distribution
$XY$ of $X$ and $Y$:

\[ H(XY) := - \sum_{x \in X,\, y \in Y} \mathrm{prob}(x,y) \cdot \log_2(\mathrm{prob}(x,y)). \]

Conditioning $X$ over $y \in Y$, we define

\[ H(X|y) := - \sum_{x \in X} \mathrm{prob}(x|y) \cdot \log_2(\mathrm{prob}(x|y)). \]

The conditional entropy (or conditional uncertainty) of $X$ assuming $Y$ is

\[ H(X|Y) := \sum_{y \in Y} \mathrm{prob}(y) \cdot H(X|y). \]

The mutual information of $X$ and $Y$ is the reduction of the uncertainty of $X$
when $Y$ is learned:

\[ I(X;Y) := H(X) - H(X|Y). \]

$H(XY)$ measures the average amount of information gathered by observing
both $X$ and $Y$. $H(X|Y)$ measures the average amount of information arising
from (an execution of) the experiment $X$, knowing the result of experiment $Y$.
$I(X;Y)$ measures the amount of information about $X$ obtained by learning
$Y$.
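
The quantities of Definition B.30 can be computed directly from a joint distribution. In the following sketch the joint distribution is stored as a dictionary mapping pairs $(x, y)$ to probabilities; this representation, the function names and the example (a fair bit $X$ observed through a channel that flips it with probability 1/4) are assumptions made for illustration.

```python
from collections import defaultdict
from math import log2

def H(dist):
    """Entropy in bits of a distribution given as {outcome: probability}."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)

def marginal(joint, axis):
    """Induced distribution on the first (axis=0) or second (axis=1) component."""
    out = defaultdict(float)
    for pair, p in joint.items():
        out[pair[axis]] += p
    return dict(out)

def conditional_entropy(joint):
    """H(X|Y) = sum over y of prob(y) * H(X|y)."""
    total = 0.0
    for y, prob_y in marginal(joint, 1).items():
        cond = {x: p / prob_y for (x, yy), p in joint.items() if yy == y}
        total += prob_y * H(cond)
    return total

def mutual_information(joint):
    """I(X;Y) = H(X) - H(X|Y)."""
    return H(marginal(joint, 0)) - conditional_entropy(joint)

# Hypothetical example: X is a fair bit, Y equals X with probability 3/4.
joint = {(0, 0): 3 / 8, (0, 1): 1 / 8, (1, 0): 1 / 8, (1, 1): 3 / 8}

print(round(H(joint), 4))                    # joint entropy H(XY): 1.8113
print(round(conditional_entropy(joint), 4))  # H(X|Y): 0.8113
print(round(mutual_information(joint), 4))   # I(X;Y): 0.1887
```

Learning $Y$ removes about 0.19 of the 1 bit of uncertainty about $X$ in this example.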


Proposition B.31. Let X and Y be finite probability spaces with joint dis- 
tribution XY. Then: 


1. $H(X|Y) \ge 0$.

2. $H(XY) = H(X) + H(Y|X)$.

3. $H(XY) \le H(X) + H(Y)$.

4. $H(Y) \ge H(Y|X)$.

5. $I(X;Y) = I(Y;X) = H(X) + H(Y) - H(XY)$.

6. $I(X;Y) \ge 0$.


Proof. Statement 1 is true since $H(X|y) \ge 0$ by Proposition B.28. The other
statements are a special case of the more general Proposition B.36.


Proposition B.32. Let X and Y be finite probability spaces with joint dis- 
tribution XY. The following statements are equivalent: 


1. $X$ and $Y$ are independent.

2. $\mathrm{prob}(y|x) = \mathrm{prob}(y)$, for $x \in X$ and $y \in Y$.

3. $\mathrm{prob}(x|y) = \mathrm{prob}(x)$, for $x \in X$ and $y \in Y$.

4. $\mathrm{prob}(x|y) = \mathrm{prob}(x|y')$, for $x \in X$ and $y, y' \in Y$.

5. $H(XY) = H(X) + H(Y)$.

6. $H(Y) = H(Y|X)$.

7. $I(X;Y) = 0$.


Proof. The equivalence of statements 1, 2 and 3 is an immediate consequence
of the definitions of independence and conditional probabilities (see Definition
B.6). Statement 3 obviously implies statement 4. Conversely, statement 3
follows from statement 4 using

\[ \mathrm{prob}(x) = \sum_{y \in Y} \mathrm{prob}(y) \cdot \mathrm{prob}(x|y). \]


The equivalence of the latter statements follows as a special case from Propo- 
sition B.37. 
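
A short numeric sketch illustrates the equivalence of independence and vanishing mutual information. It uses the identity $I(X;Y) = H(X) + H(Y) - H(XY)$ from Proposition B.31; the two joint distributions are hypothetical examples, one built as a product of its marginals and one in which $Y$ is determined by $X$.

```python
from itertools import product
from math import log2

def H(probs):
    """Entropy in bits of an iterable of probabilities."""
    return -sum(p * log2(p) for p in probs if p > 0)

def mutual_information(joint):
    """I(X;Y) = H(X) + H(Y) - H(XY) for a joint distribution {(x, y): prob}."""
    p_x, p_y = {}, {}
    for (x, y), p in joint.items():
        p_x[x] = p_x.get(x, 0.0) + p
        p_y[y] = p_y.get(y, 0.0) + p
    return H(p_x.values()) + H(p_y.values()) - H(joint.values())

p_x = {"a": 0.2, "b": 0.8}
p_y = {0: 0.5, 1: 0.3, 2: 0.2}
independent = {(x, y): p_x[x] * p_y[y] for x, y in product(p_x, p_y)}
dependent = {("a", 0): 0.2, ("b", 1): 0.8}   # Y is determined by X

print(mutual_information(independent))  # 0 up to floating-point rounding
print(mutual_information(dependent))    # equals H(X), about 0.72 bits
```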


Next we study the mutual information of two probability spaces condi- 
tioned on a third one. 


Definition B.33. Let $X$, $Y$ and $Z$ be finite probability spaces with joint
distribution $XYZ$. The conditional mutual information $I(X;Y|Z)$ is given
by

\[ I(X;Y|Z) := H(X|Z) - H(X|YZ). \]

The conditional mutual information $I(X;Y|Z)$ is the average amount of
information about $X$ obtained by learning $Y$, assuming that $Z$ is known.


When studying entropies and mutual informations for jointly distributed 
finite probability spaces X, Y and Z, it is sometimes useful to consider the 
conditional situation where z € Z is fixed. Therefore we need the following 
definition. 


Definition B.34. Let $X$, $Y$ and $Z$ be finite probability spaces with joint
distribution $XYZ$ and $z \in Z$.

\[ H(X|Y,z) := \sum_{y \in Y} \mathrm{prob}(y|z) \cdot H(X|y,z), \]

\[ I(X;Y|z) := H(X|z) - H(X|Y,z). \]

(See Definition B.30 for the definition of $H(X|y,z)$.)


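Proposition B.35. Let $X$, $Y$ and $Z$ be finite probability spaces with joint
distribution $XYZ$ and $z \in Z$:


1. $H(X|YZ) = \sum_{z \in Z} \mathrm{prob}(z) \cdot H(X|Y,z)$.
2. $I(X;Y|Z) = \sum_{z \in Z} \mathrm{prob}(z) \cdot I(X;Y|z)$.


Proof.

\[ \begin{aligned}
H(X|YZ) &= \sum_{y,z} \mathrm{prob}(y,z) \cdot H(X|y,z) \quad \text{(by Definition B.30)} \\
        &= \sum_{z} \mathrm{prob}(z) \sum_{y} \mathrm{prob}(y|z) \cdot H(X|y,z) = \sum_{z} \mathrm{prob}(z) \cdot H(X|Y,z).
\end{aligned} \]

This proves statement 1.

\[ \begin{aligned}
I(X;Y|Z) &= H(X|Z) - H(X|YZ) \\
         &= \sum_{z} \mathrm{prob}(z) \cdot H(X|z) - \sum_{z} \mathrm{prob}(z) \cdot H(X|Y,z) \quad \text{(by Def. B.30 and 1.)} \\
         &= \sum_{z} \mathrm{prob}(z) \cdot I(X;Y|z).
\end{aligned} \]

This proves statement 2.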


Proposition B.36. Let $X$, $Y$ and $Z$ be finite probability spaces with joint
distribution $XYZ$.


1. $H(XY|Z) = H(X|Z) + H(Y|XZ)$.

2. $H(XY|Z) \le H(X|Z) + H(Y|Z)$.

3. $H(Y|Z) \ge H(Y|XZ)$.

4. $I(X;Y|Z) = I(Y;X|Z) = H(X|Z) + H(Y|Z) - H(XY|Z)$.

5. $I(X;Y|Z) \ge 0$.

6. $I(X;YZ) = I(X;Z) + I(X;Y|Z)$.

7. $I(X;YZ) \ge I(X;Z)$.


Remark. We get Proposition B.31 from Proposition B.36 by taking $Z := \{z_0\}$
with $\mathrm{prob}(z_0) := 1$ and $XYZ := XY \times Z$.


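Proof.
1. We compute

\[ \begin{aligned}
& H(X|Z) + H(Y|XZ) \\
&= - \sum_{z \in Z} \mathrm{prob}(z) \sum_{x \in X} \mathrm{prob}(x|z) \cdot \log_2(\mathrm{prob}(x|z))
   - \sum_{x \in X,\, z \in Z} \mathrm{prob}(x,z) \sum_{y \in Y} \mathrm{prob}(y|x,z) \cdot \log_2(\mathrm{prob}(y|x,z)) \\
&= - \sum_{x,y,z} \mathrm{prob}(z)\, \mathrm{prob}(x,y|z) \cdot \log_2(\mathrm{prob}(x|z))
   - \sum_{x,y,z} \mathrm{prob}(x,z)\, \mathrm{prob}(y|x,z) \cdot \log_2(\mathrm{prob}(y|x,z)) \\
&= - \sum_{x,y,z} \mathrm{prob}(x,y,z) \cdot \Bigl( \log_2(\mathrm{prob}(x|z)) + \log_2 \Bigl( \frac{\mathrm{prob}(x,y|z)}{\mathrm{prob}(x|z)} \Bigr) \Bigr) \\
&= - \sum_{x,y,z} \mathrm{prob}(x,y,z) \cdot \log_2(\mathrm{prob}(x,y|z)) \\
&= - \sum_{z} \mathrm{prob}(z) \sum_{x,y} \mathrm{prob}(x,y|z) \cdot \log_2(\mathrm{prob}(x,y|z)) \\
&= H(XY|Z).
\end{aligned} \]

2. We have

\[ \begin{aligned}
H(X|Z) &= - \sum_{z \in Z} \mathrm{prob}(z) \sum_{x \in X} \mathrm{prob}(x|z) \cdot \log_2(\mathrm{prob}(x|z)) \\
       &= - \sum_{z \in Z} \mathrm{prob}(z) \sum_{y \in Y} \sum_{x \in X} \mathrm{prob}(x,y|z) \cdot \log_2(\mathrm{prob}(x|z)), \\
H(Y|Z) &= - \sum_{z \in Z} \mathrm{prob}(z) \sum_{y \in Y} \mathrm{prob}(y|z) \cdot \log_2(\mathrm{prob}(y|z)) \\
       &= - \sum_{z \in Z} \mathrm{prob}(z) \sum_{x \in X} \sum_{y \in Y} \mathrm{prob}(x,y|z) \cdot \log_2(\mathrm{prob}(y|z)).
\end{aligned} \]

Hence,

\[ H(X|Z) + H(Y|Z) = - \sum_{z \in Z} \mathrm{prob}(z) \sum_{x,y} \mathrm{prob}(x,y|z) \cdot \log_2(\mathrm{prob}(x|z)\, \mathrm{prob}(y|z)). \]

By definition,

\[ H(XY|Z) = - \sum_{z \in Z} \mathrm{prob}(z) \sum_{x,y} \mathrm{prob}(x,y|z) \cdot \log_2(\mathrm{prob}(x,y|z)). \]

Since $(\mathrm{prob}(x,y|z))_{(x,y) \in XY}$ and $(\mathrm{prob}(x|z) \cdot \mathrm{prob}(y|z))_{(x,y) \in XY}$ are
probability distributions, the inequality follows from Lemma B.29.

3. Follows from statements 1 and 2.

4. Follows from the definition of the mutual information, since $H(X|YZ) =
   H(XY|Z) - H(Y|Z)$ by statement 1.

5. Follows from statements 2 and 4.

6. $I(X;Z) + I(X;Y|Z) = H(X) - H(X|Z) + H(X|Z) - H(X|YZ) =
   I(X;YZ)$.

7. Follows from statements 5 and 6.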


The proof of Proposition B.36 is finished. 
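
The seven statements can be spot-checked numerically on a random joint distribution of three variables. The sketch below is only an illustration: the conditional entropy is computed as $-\sum \mathrm{prob}(a,b) \log_2 \mathrm{prob}(a|b)$, variables are addressed by hypothetical coordinate indices (0 for $X$, 1 for $Y$, 2 for $Z$), and the seed and tolerance are arbitrary.

```python
import random
from collections import defaultdict
from itertools import product
from math import log2

def H(joint, axes, given=()):
    """Conditional entropy H(A|B) = -sum prob(a, b) * log2(prob(a|b)).

    `axes` and `given` are tuples of coordinate indices; given=() yields H(A).
    """
    p_b = defaultdict(float)
    p_ab = defaultdict(float)
    for outcome, p in joint.items():
        a = tuple(outcome[i] for i in axes)
        b = tuple(outcome[i] for i in given)
        p_b[b] += p
        p_ab[(a, b)] += p
    return -sum(p * log2(p / p_b[b]) for (a, b), p in p_ab.items() if p > 0)

def I(joint, a, b, given=()):
    """Conditional mutual information I(A;B|C) = H(A|C) - H(A|BC)."""
    return H(joint, a, given) - H(joint, a, b + given)

X, Y, Z = (0,), (1,), (2,)

# Hypothetical random joint distribution on {0,1} x {0,1,2} x {0,1}.
rng = random.Random(1)
weights = {xyz: rng.random() for xyz in product(range(2), range(3), range(2))}
total = sum(weights.values())
joint = {xyz: w / total for xyz, w in weights.items()}

eps = 1e-9
assert abs(H(joint, X + Y, Z) - (H(joint, X, Z) + H(joint, Y, X + Z))) < eps  # 1
assert H(joint, X + Y, Z) <= H(joint, X, Z) + H(joint, Y, Z) + eps            # 2
assert H(joint, Y, Z) >= H(joint, Y, X + Z) - eps                             # 3
assert abs(I(joint, X, Y, Z) - I(joint, Y, X, Z)) < eps                       # 4
assert I(joint, X, Y, Z) >= -eps                                              # 5
assert abs(I(joint, X, Y + Z) - (I(joint, X, Z) + I(joint, X, Y, Z))) < eps   # 6
assert I(joint, X, Y + Z) >= I(joint, X, Z) - eps                             # 7
```

Statement 6 holds by construction in this code, since both sides expand to the same entropies, which mirrors the one-line proof above.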


Proposition B.37. Let X, Y and Z be finite probability spaces with joint 
distribution XYZ. The following statements are equivalent: 


1. $X$ and $Y$ are independent assuming $z$, i.e.,
   $\mathrm{prob}(x,y|z) = \mathrm{prob}(x|z) \cdot \mathrm{prob}(y|z)$, for all $(x,y,z) \in XYZ$.
2. $H(XY|Z) = H(X|Z) + H(Y|Z)$.
3. $H(Y|Z) = H(Y|XZ)$.
4. $I(X;Y|Z) = 0$.


Remark. We get the last three statements in Proposition B.32 from Proposition
B.37 by taking $Z := \{z_0\}$ with $\mathrm{prob}(z_0) := 1$ and $XYZ := XY \times Z$.


Proof. The equivalence of statements 1 and 2 follows from the computation 
in the proof of Proposition B.36, statement 2, by Lemma B.29. The equiv- 
alence of statements 2 and 3 is immediately derived from Proposition B.36, 
statement 1. The equivalence of statements 2 and 4 follows from Proposition 
B.36, statement 4. 


Remark. All entropies considered above may in addition be conditioned on
some event $\mathcal{E}$. We make use of this fact in our overview about some results
on “unconditional security” in Section 9.6. There, mutual information of the
form $I(X;Y|Z,\mathcal{E})$ appears. $\mathcal{E}$ is an event in $XYZ$ (i.e., a subset of $XYZ$) and
the mutual information is defined as usual, but the conditional probabilities
assuming $\mathcal{E}$ have to be used. More precisely:

\[ H(X|YZ,\mathcal{E}) := - \sum_{(x,y,z) \in \mathcal{E}} \mathrm{prob}(y,z|\mathcal{E}) \cdot \mathrm{prob}(x|y,z,\mathcal{E}) \cdot \log_2(\mathrm{prob}(x|y,z,\mathcal{E})), \]

\[ H(X|Z,\mathcal{E}) := - \sum_{(x,y,z) \in \mathcal{E}} \mathrm{prob}(z|\mathcal{E}) \cdot \mathrm{prob}(x|z,\mathcal{E}) \cdot \log_2(\mathrm{prob}(x|z,\mathcal{E})), \]

\[ I(X;Y|Z,\mathcal{E}) := H(X|Z,\mathcal{E}) - H(X|YZ,\mathcal{E}). \]

Here recall that, for example, $\mathrm{prob}(y,z|\mathcal{E}) = \mathrm{prob}(X \times \{y\} \times \{z\} \,|\, \mathcal{E})$ and
$\mathrm{prob}(x|y,z,\mathcal{E}) = \mathrm{prob}(\{x\} \times Y \times Z \,|\, X \times \{y\} \times \{z\}, \mathcal{E})$. The results on
entropies from above remain valid if they are, in addition, conditioned on $\mathcal{E}$.
The same proofs apply, but the conditional probabilities assuming $\mathcal{E}$ have to
be used.